PREFACE


This  is  the  final  report  of  a  project  carried  out  at  the  US  Army 
Tropic  Test  Center,  aimed  at  improving  the  measurement  of  subjective 
test  variables  in  human  factors  evaluation.  The  work  was  supported  by 
the  US  Army  In-House  Laboratory  Independent  Research  Program.  US  Army 
Infantry  soldiers  who  participated  in  the  field  studies  of  magnitude 
estimation were from the 193d Infantry Brigade (Canal Zone). This
project was conceived by Dr. D. A. Dobbins, former Chief, Technical
Division, USATTC, and preliminary work was done by Roger L. Williamson,
USATTC staff. Charles M. Kindick, USATTC staff, assisted in gathering
and analyzing magnitude estimation data in the laboratory studies.


SUMMARY 


Under the In-House Laboratory Independent Research (ILIR) Program,
the US Army Tropic Test Center conducted an investigation of
cross-modality matching methods, adapted from those used in studies of
the measurement of sensations in the field of psychophysics, for use in
measuring subjective variables in human factors evaluations. Magnitude
estimation was selected as the desired response mode, and a series of
laboratory studies of magnitude estimation of line lengths was
carried out. Three field studies were also conducted using magnitude
estimation to measure subjective variables important in human factors
evaluations. USATTC concluded that magnitude estimation is a suitable
and practical method for measuring subjective variables in human
factors evaluations, and that this method measures these variables
better than the usual rating and ranking methods.
SECTION I. BRIEF OF RESULTS AND CONCLUSIONS


The method of cross-modality matching, as developed in the field
of psychophysics for the measurement of sensation, was carefully
examined as a possible method for measuring subjective variables in human
factors evaluations. Magnitude estimation was selected as the most
feasible response mode because of its simplicity, and the fact that
everybody is familiar with the number scale and can readily give
numbers as estimates of feelings or opinions on a subjective variable.

A  series  of  laboratory  experiments  with  magnitude  estimation  of 
line length was carried out in order to see whether the results
obtained were in accord with those reported in the literature of
psychophysics. It was found that magnitude estimates of line length were
very well fitted by a power function and that the exponent of the
power function which best fitted the data was approximately .92 to .94.
These  results  were  very  similar  to  those  reported  in  the  psychophysics 
literature. 

The data on magnitude estimation of line lengths were examined for
evidence  of  the  stability  (reliability)  of  measurement,  and  it  was 
found  that  stability  of  measurement,  at  the  level  of  group  means  with 
N  =  12,  was  quite  satisfactory.  Intraclass  correlation  coefficients 
of .94 were obtained.

Three  field  studies  of  magnitude  estimation  were  conducted.  One 
involved  a  comparison  of  the  Personnel  Armor  System  for  Ground  Troops 
(PASGT)  helmets  and  vest  with  the  standard  helmet  and  vest  with  respect 
to  comfort.  A  second  study  involved  a  comparison  of  four  different 
machine  guns  with  respect  to  perceived  accuracy  and  ease  of  operation. 
In  the  third  study  soldiers  carried  loads  ranging  from  20  to  50  pounds 
(9.1  to  22.7  kg)  over  a  4-kilometer  jungle  course  and  then  were  asked 
to  give  magnitude  estimates  of  the  difficulty  of  several  parts  of  the 
course.  The  results  of  the  field  studies  on  magnitude  estimation 
agreed  with  other  measures  of  subjective  variables  and  with  objective 
measures  of  performance,  when  these  were  relevant  to  the  subjective 
variables  being  measured. 

Magnitude estimation provided more precise measurement of
subjective variables than the usual rating and ranking methods, in that it
provided measurement on a ratio scale, as compared with the ordinal
scale measurement provided by the usual methods for measuring
subjective variables. It was also noted that magnitude estimation is a
relatively easy and practical method of gathering data on subjective
variables.

Further  comparisons  should  be  made  between  magnitude  estimation 
data and those obtained from the usual rating and ranking methods of
measuring  subjective  variables,  as  well  as  with  relevant  objective 
measures of performance. Also, further empirical and theoretical
investigations should be conducted concerning appropriate methods of
statistical analysis for magnitude estimation data.
SECTION II. INTRODUCTION


Human factors evaluation of military equipment and materiel
involves both objective measures of performance and subjective measures,
such as those relating to comfort, preference and confidence. In an
effort to gain acceptance and status among their professional peers,
human factors specialists have tended to use objective or "hard"
measures of performance as much as possible.¹ However, it has always
been necessary to use subjective measures in the area of soldier
acceptance of equipment and materiel. What a soldier thinks or feels
about  a  piece  of  equipment  is  likely  to  have  a  strong  influence  on  how 
effectively  he  uses  that  equipment. 

Work  on  a  new  approach  to  measuring  subjective  test  data  in  the  US 
Army test and evaluation setting was begun by Williamson and
Dobbins.² They completed an extensive review of recent literature and
laid out several steps to be accomplished in carrying out the
project. This report continues the work on improved methods for
measuring subjective test data in the Army test and evaluation setting.

Subjective measures have the reputation of being "soft" in
contrast to "hard" measures of performance. Many people are inclined to
place less reliance on subjective measures than on objective
measures. There are several reasons for the suspicion and uneasiness in
regard to subjective measures: (1) Subjective measures are likely to
be much more variable than objective measures because they are more
susceptible to the effects of uncontrolled variables which cannot be
anticipated. This means that relatively sophisticated experimental
designs and methods of statistical analysis must be used with
subjective measures, and some people may have difficulty understanding
results presented in these terms. (2) The development of subjective
measures requires great effort to communicate clearly to subjects the
meaning of the subjective variables on which they are to provide
data. Frequently not enough effort is made, with the result that
inferior measures are often used and the reputation of subjective measures
suffers accordingly. (3) The questionnaire, interview and ranking
methods used in gathering subjective test data³ yield only ordinal
data; that is, data which tell us only "more than" or "less than," and
not "how much" more than or less than. This last reason involves
the quality of measurement, and the objective of the project described
in this report is to improve the quality of subjective measurement.

1 Klein, David, "Social Aspects of Exposure to Highway Crash," Human
Factors, pp. 211-219.

2 Williamson, R. L., and Dobbins, D. A. A New Approach Toward
Quantifying Subjective Test Data.

3 TECOM Pamphlet 602-1, Vol 1, Man-Materiel Systems Questionnaire
and Interview Design (Subjective Testing Techniques).
Measurement may be thought of as the process of matching the
characteristics of objects or entities with a set of categories which
constitute a scale. Four different scales of measurement may be
distinguished, based on the nature of the categories and the relationships
among the categories constituting the measurement scale.⁴ The names
given to these four scales of measurement are: nominal, ordinal,
interval and ratio scales. Quality of measurement depends on the kind
of measurement scale used.

Measurement on a nominal scale involves describing objects or
entities by sorting them into a set of categories about which one can
say  only  that  they  differ  from  each  other.  This  is  the  lowest  form  of 
measurement.  Sorting  a  bowl  of  mixed  fruit  into  apples,  oranges, 
pears  and  grapes  is  an  example  of  measurement  on  the  nominal  scale. 
Classifying  persons  as  male  or  female  is  another  example. 

Measurement  on  an  ordinal  scale  involves  describing  objects  or 
entities by sorting them into a set of categories which not only
differ from one another, but also have some kind of natural order
inherent in the categories. Ranking various fruits on the basis of their
sweetness  or  sourness  is  an  example  of  ordinal  measurement.  Assigning 
grades  to  students  on  the  basis  of  the  number  of  correct  answers  on 
the  final  exam  is  another  example. 

Measurement  on  an  interval  scale  involves  describing  objects  or 
entities by sorting them into a set of categories so that the
categories are different from one another, are ordered in some natural
manner, and the intervals between adjacent categories constituting the
scale  are  equal.  The  Fahrenheit  and  Celsius  (centigrade)  temperature 
scales  are  examples  of  interval  scales,  in  that  equal  intervals  or 
units  of  temperature  are  measured  by  equal  volumes  of  expansion.  In 
both  cases,  arbitrary  zero  points  are  designated  which  do  not  denote 
the  total  absence  of  heat.  Some  score  scales  for  achievement  tests 
are interval scales (those based on percentiles or deciles), if one
accepts  as  legitimate  the  basis  for  equalization,  which  is  that  the 
proportion  of  the  population  falling  in  any  scaled  score  interval  is 
equal  to  that  falling  in  any  other  numerically  equal  scaled  score 
interval.  Again,  zero  points  on  such  score  scales  do  not  denote  a 
total  lack  of  ability. 

Measurement on a ratio scale involves describing objects or
entities by sorting them into a set of categories so that the categories
are  different  from  one  another,  are  ordered  in  some  natural  manner, 
the  intervals  between  adjacent  categories  constituting  the  scale  are 
equal,  and  one  of  the  categories  is  a  natural  zero  point  for  the 
scale,  denoting  the  total  absence  of  the  attribute  being  measured. 
The existence of a natural zero point for a scale makes it possible to
form meaningful ratios, and thus to make statements such as "Quantity
A is half of quantity B," or "Quantity A is x percent of quantity B."
This is the highest form of measurement. The Kelvin temperature scale
is an example of a ratio scale, since its zero point (which has never
been achieved) corresponds to the complete absence of heat. The basic
physical scales, such as length, weight and electrical resistance, are
also ratio scales.

4 Stevens, S. S., ed., "Mathematics, Measurement and Psychophysics,"
Handbook of Experimental Psychology, 1951.

As  stated  above,  current  methods  for  gathering  subjective  test 
data  yield  only  ordinal  data.  The  objective  of  the  project  described 
in  this  report  is  to  develop  methods  for  measuring  subjective  test 
data  on  a  ratio  scale,  and  to  investigate  the  practical  problems  of 
using ratio scale measurement in human factors evaluations of
subjective variables. It will still be necessary to use suitable
experimental designs to control unanticipated variation, and to be quite
precise in communicating to soldiers the meaning of the subjective
variables on which they are to provide data.
SECTION III. BACKGROUND


Subjective  responses  have  been  most  carefully  and  extensively 
studied in the area of sensation. For well over 100 years,
psychologists and physicists have attempted to measure the intensity of
sensations, and to relate these measurements to the intensity of the
physical stimuli which arouse the sensations. For most of this period,
study  was  concentrated  on  determining  absolute  and  differential 
thresholds  for  various  sensory  modalities  such  as  vision,  hearing, 
taste  and  touch.  An  absolute  threshold  is  the  least  intense  physical 
stimulus that will reliably arouse a subjective response or a
sensation. A differential threshold is the smallest difference between two
physical stimuli which can be reliably recognized as producing
different subjective responses. The search for an absolute threshold is, of
course, a search for a zero point on which to anchor a scale for
measuring sensation. And the search for differential thresholds, or "just
noticeable  differences,"  is  a  search  for  units  with  which  to  construct 
scales  for  measuring  sensations. 

As  elements  to  use  in  fashioning  scales  for  measuring  subjective 
responses, absolute and differential thresholds have not been
completely satisfactory. A great many studies of absolute sensory
thresholds have shown that subjective responses to weak physical
stimuli are shifting and variable, and that there is a zone of uncertainty
between a stimulus that is clearly too weak to arouse a subjective
response and one that is definitely strong enough to arouse a
subjective response. Likewise, studies of differential thresholds have
revealed a zone of uncertainty between stimulus differences that are
clearly too small to arouse recognizably different subjective
responses, and stimulus differences that are definitely large enough to
arouse recognizably different subjective responses. Thresholds have
been  determined,  then,  by  arbitrary  statistical  methods  of  dividing 
these  zones  of  uncertainty  and  are  thus  derived  from  unstable  and 
fluctuating  judgments  (Stevens,  1951). 

During  the  last  25  years  substantial  progress  has  been  made  toward 
improving the measurement of sensation (Stevens, 1975).⁵ It has
been determined that people can easily and confidently make
cross-modality matches, such as adjusting the brightness of a light to match
the  loudness  of  sounds  presented  by  the  experimenter.  The  sounds 
should  be  chosen  to  cover  a  substantial  part  of  the  range  between  very 
faint  sounds  and  very  loud  sounds.  The  data  from  such  an  experiment 
consist  of  the  loudness  values  of  the  stimulus  sounds,  expressed  in 
physical terms; and the brightness values, in physical terms, obtained
by  having  the  light  adjusted  by  the  observer.  When  the  brightness 
values  are  plotted  against  the  loudness  values  on  log-log  coordinates, 
the points fall very nearly on a straight line. (An equivalent method
is to convert both the brightness data and the loudness values to
decibels and plot the points on ordinary linear graph paper.) This
tells us that the subjective brightness ($\psi$) of the light is a
power function of the objective loudness ($\phi$) of the sound stimuli:
$\psi = \phi^{n}$. If we take the logarithm of each side of this equation
(which is the analytical equivalent of plotting the relationship on
log-log coordinates), we obtain: $\log \psi = n \log \phi$. Thus, we see that
the exponent $n$ is the slope of the straight line which is obtained
when the brightness data are plotted against the loudness values on
log-log coordinates.

5 Stevens, S. S., Psychophysics, 1975.
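
As a worked restatement of the logarithmic step (a sketch only, using the symbols $\psi$, $\phi$ and $n$ from the text, and assuming common logarithms and the factor of 10 that this report adopts for its decibel conversions), multiplying both sides by 10 shows why a plot in decibels on linear graph paper gives the same straight line as a log-log plot:

```latex
\begin{align}
\psi &= \phi^{n} \\
\log_{10}\psi &= n \log_{10}\phi \\
10\log_{10}\psi &= n\,\bigl(10\log_{10}\phi\bigr)
\end{align}
```

The slope in decibel coordinates is therefore the same exponent $n$. If a multiplicative constant were placed in front of $\phi^{n}$ (an assumption beyond the equation as written here), it would shift the intercept of the line but leave the slope unchanged.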

Over  the  last  25  years,  there  has  been  established  an  extensive 
and interwoven set of these power functions which relate various
sensory continua to each other. The relationship between any sensory
continuum and any other sensory continuum has been found to be a power
function, defined by the value of the specific exponent, n, for that
particular relationship. Among the sensory continua involved have
been: loudness of sound, brightness of light, 60-hertz vibration on
the skin, 60-hertz electric current through the fingers* (with a
current level high enough to produce sensation but below the levels that
would produce pain or "shock"), handgrip force, warmth on the arm,
heaviness of lifted weights, pressure on palm of hand, cold on the
arm, redness (or saturation) of color, roughness of emery cloth on the
skin, length of lines, hardness of rubber balls squeezed,* sweetness,
saltiness, sourness, and bitterness of taste, and number or
numerosity. The last of these continua, number or numerosity, must be
explained further. Data on the relationship of this continuum to any
sensory  continuum  are  obtained  by  presenting  observers  with  stimuli, 
such  as  sounds  of  various  loudness,  and  asking  them  to  produce  numbers 
describing  the  loudness  of  the  sounds--the  louder  the  sound,  the 
larger  the  number.  This  procedure  is  called  magnitude  estimation.  It 
is  important  that  the  observers  not  be  given  any  guidance  on  the  scale 
to  be  used,  such  as,  "Rate  on  a  scale  from  1  to  10  the  loudness  of 
these sounds." If observers are given such guidance, they will
apportion the provided scale numbers to cover the range of whatever
perceptual continuum they are dealing with, and the result will be
measurement (at best) on the interval scale (Stevens, 1975, pp. 134-139).

The  interwoven  set  of  power  functions  referred  to  in  the  last 
paragraph  exists  because  relationships  between  sensory  continua  have 
been  found  to  be  transitive  (Stevens,  1975,  pp.  100-107).  The  results 
presented  in  table  1  illustrate  this  transitivity. 

The data in table 1 show that if loudness is matched with
vibration, and loudness with shock, and exponents obtained for these power
functions, it can be predicted that the exponent obtained when
vibration is matched with shock is: 8.46 ÷ 1.71 = 4.95, as compared with
the experimentally determined exponent of 5.00. Further, if vibration
is matched with shock, and loudness with shock, and exponents obtained
for these power functions, the exponent obtained can be predicted when
loudness is matched to vibration: 8.46 ÷ 5.00 = 1.69, as compared
with the experimentally determined exponent of 1.71.

Table 1. Experimental Validation of Transitivity

                                   Experimentally         Predicted
Description of Experiment          Determined Exponent    Exponent

Matching Loudness with Vibration   1.71                   (1.69)
Matching Loudness with Shock       8.46
Matching Vibration with Shock      5.00                   (4.95)

* Data from these two sensory continua are not so well fitted by a
power function as the other continua, for as yet unknown reasons.

Extensive  experimental  exploration  of  transitivities  among  power 
function relationships between sensory continua has led to the
generalization: from the exponents obtained by experimentally matching any
two sensory continua with a third sensory continuum, the exponent for
the power function relating those two sensory continua to each other
can be predicted.
Thus  the  characterization:  "interwoven  set  of  power  functions." 
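
The prediction rule illustrated by table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `predict_exponent` and the dictionary keys are labels introduced here, not terms from the report, and the exponents are the experimentally determined values given in table 1.

```python
# Experimentally determined exponents from table 1.
# The dictionary keys are illustrative labels, not terms from the report.
measured = {
    "loudness_vs_vibration": 1.71,
    "loudness_vs_shock": 8.46,
    "vibration_vs_shock": 5.00,
}


def predict_exponent(first_exponent, second_exponent):
    """Predict a third exponent as the quotient of two measured exponents."""
    return first_exponent / second_exponent


# Vibration matched with shock, predicted from the two loudness matches.
predicted_vibration_vs_shock = predict_exponent(
    measured["loudness_vs_shock"], measured["loudness_vs_vibration"]
)
# Loudness matched with vibration, predicted from the two shock matches.
predicted_loudness_vs_vibration = predict_exponent(
    measured["loudness_vs_shock"], measured["vibration_vs_shock"]
)

print(f"vibration vs shock: predicted {predicted_vibration_vs_shock:.2f}, "
      f"measured {measured['vibration_vs_shock']:.2f}")
print(f"loudness vs vibration: predicted {predicted_loudness_vs_vibration:.2f}, "
      f"measured {measured['loudness_vs_vibration']:.2f}")
```

Rounded to two decimals, the two quotients reproduce the predicted exponents shown in parentheses in table 1.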

Because the exponent obtained for a given sensory continuum
varies, depending on the sensory continuum against which it is matched
(note the two different experimentally determined exponents for
loudness in table 1), it is necessary to select a reference continuum.
Then  exponents  for  all  other  sensory  continua  may  be  expressed  in 
terms  of  the  reference  continuum,  which  by  definition  is  assigned  an 
exponent  of  1.00.  The  number  or  numerosity  continuum  has  been  widely 
accepted  as  the  reference  continuum.  It  is  convenient  because  people 
are  almost  universally  familiar  with  it  and,  furthermore,  many  of  the 
basic  measuring  scales  of  physics  such  as  length  and  mass  are  linear 
(exponent = 1.00) against number. Thus, magnitude estimation has
become a widely used technique in the measurement of sensation.

The idea that legitimate measurement is achieved simply by having
people emit numbers in response to physical stimulation of different
kinds or intensities has frequently been challenged. Further, the
assertion that this method produces measurement on a ratio scale has
been hard for many people to accept. It is true that the necessary
elements of a natural zero point and equality of units are not
intuitively obvious in magnitude estimation, as they are for basic physical
measurements of length and mass; however, laboratory data obtained by
the magnitude estimation technique, when subjected to the treatments
appropriate for ratio scale data such as the geometric mean and
logarithmic transformations, have been very useful in measuring sensation
in the field of psychophysics.

Magnitude estimation has also been used on a wide variety of
subjective dimensions in other areas of psychology and in political
science, sociology and criminology. Among the successful applications
of magnitude estimation have been studies on attitude toward religion,
preference for wristwatches, judged quality of handwriting and
drawings, esthetic judgments of music, intensity and pleasantness of
odors, judgments of masculinity and femininity, a political
dissatisfaction scale, judged prestige of occupations, judgments of social
status,  perceptions  of  national  power,  scales  of  national  conflict  and 
cooperation,  judged  seriousness  of  crimes  generally  and  of  thefts  of 
various  amounts  of  money,  and  estimates  of  word  frequency  (Stevens, 
1975,  Chapter  8). 


SECTION  IV.  LABORATORY  STUDIES  OF  MAGNITUDE  ESTIMATION 


At  the  US  Army  Tropic  Test  Center,  several  laboratory  studies  of 
magnitude  estimation  were  carried  out  before  attempting  use  of  the 
method  in  field  studies.  These  laboratory  studies  served  as  pilot 
tests to insure that the technique was usable. They actually
constituted a calibration step to see if power functions could be obtained
with  exponents  similar  to  those  obtained  by  other  investigators. 

As  a  simple  laboratory  method  of  studying  subjective  variables, 
magnitude  estimation  of  the  lengths  of  lines  was  used.  The  reverse 
procedure,  having  subjects  draw  lines  judged  to  represent  (by  their 
lengths)  the  size  of  numbers  presented  to  them  (line  production),  was 
used  also. 

A.  MAGNITUDE  ESTIMATION  OF  LENGTHS  OF  LINES 

1.  Experimental  Materials  and  Method:  First  Experiment 

Lines of 1/8, 1 1/8, 2 1/8, 3 1/8, 4 1/8, 5 1/8, 6 1/8 and 7 1/8
inches were drawn, each line on a sheet of 8- by 10 1/2-inch paper.
Thirty-six copies of each of these eight sheets of paper were then
reproduced to be used as stimuli in this experiment. For each of 12
subjects, three sets of one each of the eight lines of different
lengths were selected. Each of these 36 sets of eight sheets of paper
was then arranged in random order independently. After each subject
had been given instructions, he/she was presented, one at a time, with
the 24 sheets of paper which constituted three sets. Thus, each
subject was presented with one set of the eight lines of different
lengths in random order as a first trial; a second set of the eight
lines in a different, independent random order as a second trial; and
a third set in a different, independent random order as a third trial.
For 12 subjects, then, 36 different, independent random orders of
presentation of the eight lines of different lengths were prepared.
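
The randomization described above can be sketched in a few lines. This is a hedged illustration, not the procedure actually used in 1970s-era preparation of the stimulus sheets: the function name `make_presentation_orders`, the `seed` parameter and the dictionary layout are assumptions introduced here. Note also that independent shuffles do not guarantee that all 36 orders differ, although with 8 stimuli (40,320 possible orders) a repeat is unlikely.

```python
import random

# The eight line lengths in inches: 1/8, 1 1/8, ..., 7 1/8.
LINE_LENGTHS_IN = [whole + 0.125 for whole in range(8)]


def make_presentation_orders(n_subjects=12, n_trials=3, seed=None):
    """Return an independent random order of the stimuli per subject and trial.

    Keys are (subject, trial) pairs, both counted from 1.
    """
    rng = random.Random(seed)
    orders = {}
    for subject in range(1, n_subjects + 1):
        for trial in range(1, n_trials + 1):
            order = list(LINE_LENGTHS_IN)  # fresh copy for every shuffle
            rng.shuffle(order)
            orders[(subject, trial)] = order
    return orders


orders = make_presentation_orders(seed=1)
print(len(orders), "orders prepared")
print("subject 1, trial 1:", orders[(1, 1)])
```

Each subject sees the three orders keyed by that subject's number, for 24 sheets in all.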

As  a  general  introduction  to  both  magnitude  estimation  of  line 
length and line production in response to numbers, the following
instructions were given to each subject:

We're  doing  some  research  on  subjective  rating  scales,  and  we 
want  you  to  help  us.  Subjective  rating  means  telling  how  you 
feel  about  something,  such  as:  how  good  a  pair  of  shoes 
fits,  how  comfortable  a  helmet  feels,  or  how  easy  (or  how 
difficult)  it  is  to  adjust  the  straps  on  a  pack.  It's  hard 
to  get  very  good  measurements  of  this  kind  of  thing  and  we're 
trying  to  improve  the  methods  used  in  subjective  ratings  of 
many  kinds  of  equipment  that  we  test  for  the  Army. 

Then  the  following  instructions  for  magnitude  estimation  of  line 
length  were  given  to  each  subject: 


You will be presented with a series of lines of various
lengths. Your task is to tell how long the lines seem to you
by assigning numbers to them. Assign the first line any
number that seems appropriate to you. Then assign larger or
smaller numbers to the other lines depending on how long they
appear to you. You can use any numbers you want: large,
small, whole numbers, decimals, or fractions; but please do
not use zero or negative numbers. Also, you shouldn't think
of the lines as being so many inches or centimeters long.
Try to make each number match the length of the line as it
appears to you. Please write the number you choose in the
space in the lower right hand corner of each page.

The  12  subjects  were  volunteers  from  the  staff  of  US  Army  Tropic 
Test  Center,  both  men  and  women,  and  both  military  and  civilian. 

2.  Analysis  and  Results:  First  Experiment 

The magnitude estimates for each trial were arranged in eight
columns, one column for each of the various line lengths, and 12 rows for
the 12 subjects. Geometric means of the magnitude estimates were
computed over the 12 subjects for each of the line lengths for each
trial,  and  for  each  line  length  for  all  three  trials  combined.* 
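
A minimal sketch of the geometric mean computation follows. The function name `geometric_mean` and the example estimates are hypothetical and are not data from this study; the sketch assumes all estimates are positive, which matches the instruction to subjects not to use zero or negative numbers.

```python
import math


def geometric_mean(estimates):
    """Return the geometric mean of a list of positive magnitude estimates."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    if any(value <= 0 for value in estimates):
        raise ValueError("estimates must be positive")
    # Averaging the logarithms is equivalent to taking the nth root of the
    # product, and avoids overflow when the estimates are large.
    mean_log = sum(math.log(value) for value in estimates) / len(estimates)
    return math.exp(mean_log)


# Hypothetical estimates from three subjects for one line length.
example_estimates = [2.0, 8.0, 4.0]
print(geometric_mean(example_estimates))
```

Because subjects are free to choose their own number ranges, the geometric mean keeps one subject who uses very large numbers from dominating the group value, as an arithmetic mean would allow.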

The  lengths  of  the  eight  lines  were  converted  to  ratios,  using  the 
length  of  the  shortest  line  as  the  base  for  the  ratios.  Similarly, 
the  eight  geometric  means  of  the  magnitude  estimation  responses  for  12 
subjects over all three trials combined were converted to ratios,
using the first geometric mean (of the magnitude estimates for the
shortest  line)  as  the  base  for  these  ratios.  Then  both  of  these  sets 
of  ratios  were  converted  to  decibels  by  taking  the  common  logarithm  of 
each  ratio  and  multiplying  it  by  10.** 
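
The ratio and decibel conversion can be sketched as follows, using the line lengths and geometric means reported in table 2. The function name `to_decibels` is an assumption introduced for illustration; the definition of the decibel as 10 times the common logarithm of a ratio is the one adopted in this report.

```python
import math

# Line lengths (inches) and geometric means of magnitude estimates,
# as reported in table 2.
LINE_LENGTHS_IN = [0.125, 1.125, 2.125, 3.125, 4.125, 5.125, 6.125, 7.125]
GEOMETRIC_MEANS = [0.514, 3.956, 6.967, 9.413, 13.073, 17.278, 19.707, 23.380]


def to_decibels(values):
    """Express each value as 10 * log10 of its ratio to the first value."""
    base = values[0]
    return [10 * math.log10(value / base) for value in values]


stimulus_db = to_decibels(LINE_LENGTHS_IN)
response_db = to_decibels(GEOMETRIC_MEANS)

for length, s_db, r_db in zip(LINE_LENGTHS_IN, stimulus_db, response_db):
    print(f"{length:5.3f} in  stimulus {s_db:5.2f} dB  response {r_db:5.2f} dB")
```

Because each series is divided by its own first value, the first stimulus and the first response both convert to 0 dB.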

The logarithms of the ratios could have been used without
converting to decibels, but following the conventions established by Stevens
(1975), the decibel unit is used here. Using the length of the
shortest line and using the geometric mean of magnitude estimates of the
shortest line as bases for the conversions to ratios results in the
first point being at the origin of the coordinate system, when the
points are plotted on decibel scales. Taking ratios and converting
them to decibels (or logarithms) makes it possible to plot a power
function as a straight line on linear graph paper. The lengths of the
eight lines, the eight geometric means for 12 subjects over all three
trials, and the ratios and decibel values obtained from them are shown
in table 2.

* The geometric mean of n numbers is obtained by multiplying all n
numbers together, and then taking the nth root of the product.

** Because length of line and number are not obviously analogous
either to power or to voltage and current, it was arbitrarily
decided to define the decibel scale, for present purposes, as 10
times the common logarithm of the ratio of the lengths of two
lines, or of two numbers.


Table 2. Ratios and Conversions to Decibels: Magnitude Estimates,
First Experiment

            Stimuli                           Response

Lengths                          Geom Means
of Lines    Ratios   Decibels    of Mag Est   Ratios   Decibels

1/8 in         1       0.00        0.514       1.000     0.00
1 1/8 in       9       9.54        3.956       7.696     8.86
2 1/8 in      17      12.30        6.967      13.554    11.32
3 1/8 in      25      13.98        9.413      18.313    12.63
4 1/8 in      33      15.19       13.073      25.434    14.05
5 1/8 in      41      16.13       17.278      33.615    15.27
6 1/8 in      49      16.90       19.707      38.340    15.84
7 1/8 in      57      17.56       23.380      45.486    16.58

The magnitude estimates are plotted against the lengths of lines
in figure 1, using the decibel figures from table 2. The straight
line drawn in figure 1 was fitted by least squares to the eight points
shown. The slope of this line is .94, which is reasonably close to
the value of 1.00 reported by Stevens (1975).

It can be seen in figure 1 that the eight points lie very nearly
on the straight line, as they should if magnitude estimation is a
power function of line length.
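As a rough check on the fit described above, the decibel columns of
table 2 can be run through an ordinary least-squares calculation. The
helper below is an illustrative sketch, not the procedure actually used
at USATTC. Because a power function R = k * S^b becomes
log R = log k + b * log S, the slope of the fitted line in decibel
coordinates is the exponent b.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Decibel values copied from table 2 (stimulus and response columns).
line_db = [0.00, 9.54, 12.30, 13.98, 15.19, 16.13, 16.90, 17.56]
estimate_db = [0.00, 8.86, 11.32, 12.63, 14.05, 15.27, 15.84, 16.58]

intercept, slope = least_squares_line(line_db, estimate_db)
print(f"Y = {intercept:.2f} + {slope:.2f}X")
```

The slope comes out close to the reported .94. The intercept can differ
slightly from the reported -.09 because the table entries are rounded
to two decimals.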

B.  LINE  PRODUCTION  IN  RESPONSE  TO  NUMBERS 

1.  Experimental  Materials  and  Method:  First  Experiment 

The numbers 1, 3, 10, 30, 100, 315, 1,000, 3,150 and 10,000 were
chosen as stimuli on the basis that their logarithms are
(approximately) evenly distributed over the range 0 to 4: 0, 0.5, 1.0,
1.5, . . ., 4.0. Each of these nine numbers was written on an 8- by
10 1/2-inch sheet of paper, and 36 copies were reproduced of each of
these nine sheets of paper. In the same manner as was done for the
experiment on magnitude estimation of the length of lines, three sets
of one each of the nine numbers were selected for each of the 12
subjects. Each of these 36 sets of nine sheets of paper was then
arranged in random order independently. After each subject had been
given instructions, he or she was presented, one at a time, with the 27
sheets of paper (numbers) which constituted three sets. Thus, each
subject was presented with one set of the nine numbers in random order
as a first trial; a second set of the nine numbers in a different,
independent random order as a second trial; and a third set in a
different, independent random order as a third trial. For 12 subjects,
then, 36 different, independent random orders of the nine numbers were
prepared.

Figure 1. Magnitude Estimation of Lengths of Lines. First Experiment:
Twelve Subjects, Three Trials on Each of Eight Line Lengths.
Best-fitting Line (least squares): Y = -.09 + .94X.

The subjects were given the same "general introduction" instructions
as were quoted in the last section on magnitude estimation, and then
were given the following specific instructions for the line production
experiment:

You  will  be  presented  with  numbers  ranging  from  1  to  10,000. 

Your  task  is  to  draw  a  line  for  each  number  so  that  the 
length  of  the  line  represents  the  size  of  the  number.  Draw 
the  line  from  left  to  right  across  the  page.  Make  the  line 
as  long  as  you  think  the  number  is  large.  Try  not  to  think 
of  inches  or  centimeters,  or  any  other  units  of  length. 

The subjects were given a plastic straight-edge, which was not
marked with a scale of any kind. The 12 subjects were volunteers from
the staff of US Army Tropic Test Center, both men and women, and both
military and civilian. None of these 12 subjects had served as a
subject in the first magnitude estimation experiment.

2. Analysis and Results: First Experiment

The lines drawn by the subjects in response to the numbers were
measured and their lengths (in millimeters) recorded in nine columns
and 12 rows in the same manner as was done with the magnitude
estimation. Geometric means of the line lengths were computed over the
12 subjects for each of the nine stimulus numbers for each trial and
for each number for all three trials combined.

The nine numbers used as stimuli were converted to ratios and then
to decibels. Similarly, the geometric means of the line lengths for
all three trials combined were converted to ratios, and then to
decibels, as was done in the magnitude estimation experiment. The nine
numbers, the nine geometric means for 12 subjects over all three
trials, and the ratios and decibel values obtained from them are shown
in table 3.


Table 3. Ratios and Conversion to Decibels:
Line Production, First Experiment

          Stimuli                            Responses
                                Geom Means
Numbers    Ratios   Decibels    of Line Lengths   Ratios    Decibels

     1          1      0.00         1.133 mm        1.000     0.00
     3          3      4.77         2.868 mm        2.531     4.03
    10         10     10.00         5.160 mm        4.554     6.58
    30         30     14.77         8.458 mm        7.465     8.73
   100        100     20.00        13.445 mm       11.867    10.74
   315        315     24.98        22.983 mm       20.285    13.07
 1,000      1,000     30.00        48.245 mm       42.582    16.29
 3,150      3,150     34.98        94.253 mm       83.189    19.20
10,000     10,000     40.00       225.982 mm      199.455    23.00

The geometric means of the line lengths are plotted against the
numbers, both being expressed in decibels, in figure 2. The straight
line drawn in figure 2 was fitted by least squares to the nine points
shown. It can be seen that the nine points lie quite close to the
line, as they should if line production is a power function of the
numbers used as stimuli. However, the slope of this line is only .53,
which is considerably less than the .94 slope obtained when magnitude
estimates were plotted against line lengths. This is an example of
the regression effect, which may be described as a tendency of subjects
to restrict the range of the variable they control, i.e., numbers
produced in the case of magnitude estimation and line length in the
case of line production (Stevens, 1975, pp. 271-281). However, the
difference in slopes (exponents) seems rather large. For this reason,
further experiments were undertaken, first with line production in
response to numbers, since it was the exponent obtained in this
experiment which seemed to differ so much from the expected value.

Figure 2. Line Production in Response to Numbers. First Experiment:
Twelve Subjects, Three Trials on Each of Nine Numbers.
Best-fitting Line (least squares): Y = .61 + .53X.


3.  Experimental  Materials  and  Method:  Second  Experiment 

The range of numbers used as stimuli in the first experiment was
quite great (1 to 10,000, or 40 decibels), but the subjects were
limited in the length of line they could draw to about 267 millimeters
by the 10 1/2-inch length of the sheet of paper. Therefore, it was
hypothesized that this restriction had prevented the subjects from
varying the lengths of lines drawn over a range comparable to the range
of numbers used as stimuli. The line production experiment was
repeated, using pieces of paper 14 15/16 inches long, which permitted
the subjects to draw lines as long as 380 millimeters. The same
numbers were used as stimuli as in the first experiment. Again,
subjects were 12 volunteers from the US Army Tropic Test Center staff,
different persons from those who served as subjects for the first
experiment.

4.  Analysis  and  Results:  Second  Experiment 

When the results of the second experiment were analyzed and plotted
in the same fashion as those of the first experiment in line
production, the resulting plot appeared as shown in figure 3. Again,
the points fall very nearly on a straight line, but the slope of this
line is only .50. Therefore, the hypothesis that restricted space
for line drawing lowered the exponent of the power function relating
line production to numbers was rejected, and a third experiment was
performed.

Figure 3. Line Production in Response to Numbers. Second Experiment:
Twelve Subjects, Three Trials on Each of Nine Numbers.
Best-fitting Line (least squares): Y = 1.01 + .50X.

5.  Experimental  Materials  and  Method:  Third  Experiment 

In the second experiment, the ratio of the largest number used as
a stimulus to the smallest number was 10,000:1, thus yielding a
40-decibel range on the number scale in figure 3. But the ratio of the
geometric mean of line lengths produced in response to the largest
number, to the geometric mean of line lengths produced in response to
the smallest number, was only 169.88:1, yielding only a 22.30-decibel
range on the line production scale in figure 3. Therefore, in the
third experiment an attempt was made to reduce this difference between
the two ratios by using as stimuli the set of numbers 1, 2, 4, 8,
16, 32, 64, 128, and 256, which has a ratio between largest and
smallest of only 256:1, yielding a 24.08-decibel range on the number
scale. The experiment was administered in the same fashion as the
first experiment on line production, with the exception that the
instructions to the subjects were modified to specify a range of
numbers from 1 to 256. The 13 subjects were volunteers from the US
Army Tropic Test Center staff: men and women, military and civilian.

6. Analysis and Results: Third Experiment

The results of the third experiment were analyzed and plotted in
the same fashion as those of the first experiment in line production,
producing the plot shown in figure 4. The points fall very nearly on
a straight line, and the slope of the line this time is .71, which is
considerably closer to the value of .94 obtained from magnitude
estimation in response to line length. The decibel value for the
geometric mean of the lengths of lines drawn in response to the largest
number declined from 22.30 in the second experiment (in which the
decibel value for the largest number was 40.00) to 18.42 in the third
experiment (in which the decibel value for the largest number was
24.08). Thus, the decrease in the range of decibel values for line
production between the two experiments was relatively less than the
decrease in range of decibel values produced by reducing the range of
stimulus numbers to 256:1, and an increased slope for the best-fitting
line results.
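The arithmetic behind this comparison can be sketched as follows. The
ratio of response range to stimulus range used here is only a crude
end-point stand-in for the slope, not the least-squares slope reported
in the figures, and the function and variable names are illustrative.

```python
import math

def decibel_range(largest, smallest):
    """Range of a set of values on the report's decibel scale."""
    return 10.0 * math.log10(largest / smallest)

# Stimulus (number) ranges for the second and third experiments.
second_stimulus = decibel_range(10_000, 1)
third_stimulus = decibel_range(256, 1)

# Response (line length) ranges: the second is recomputed from the
# reported 169.88:1 ratio; the third is taken directly from the text.
second_response = decibel_range(169.88, 1)
third_response = 18.42

# End-point ratio of response range to stimulus range in each experiment.
second_ratio = second_response / second_stimulus
third_ratio = third_response / third_stimulus
print(round(second_stimulus, 2), round(third_stimulus, 2))
print(round(second_response, 2), round(second_ratio, 2), round(third_ratio, 2))
```

The stimulus range shrinks by about 40 percent while the response range
shrinks by less than 20 percent, so the end-point ratio rises, in the
same direction as the change in fitted slope from .50 to .71.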

Figure 4. Line Production in Response to Numbers. Third Experiment:
Thirteen Subjects, Three Trials on Each of Nine Numbers.
Best-fitting Line (least squares): Y = 1.05 + .71X.


C.  CONFIRMATORY  EXPERIMENTS 

At this point in the investigation, two confirmatory experiments
were performed, one with magnitude estimation and one with line
production.

1. Magnitude Estimation: Experimental Materials and Method

As stimuli for this experiment, 10 lines of .12 (1/8), .22, .40,
.70, 1.25, 2.22, 3.95, 7.03, 12.50 and 22.23 inches in length were
chosen. The basis for choosing lines of these particular lengths was
that when they are converted to decibels using .125 (1/8) inch as the
base for ratios, a series of equally spaced numbers is obtained:
0, 2.5, 5.0, . . ., 20.0, 22.5 decibels. Also, the range on
the length of lines variable is extended to 22.5 decibels, compared
with 17.6 decibels in the first experiment on magnitude estimation.
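A set of stimuli equally spaced on the decibel scale can be generated
by inverting the decibel definition. The sketch below is illustrative
(the function name is hypothetical); it reproduces the line lengths
listed above from the .125-inch base and the 2.5-decibel step.

```python
def equal_decibel_steps(base, step_db, count):
    """Values whose decibel levels relative to `base` are 0, step_db, 2*step_db, ..."""
    return [base * 10 ** (i * step_db / 10.0) for i in range(count)]

# Ten line lengths in inches: base of 1/8 inch, steps of 2.5 decibels.
lengths = equal_decibel_steps(base=0.125, step_db=2.5, count=10)
rounded = [round(length, 2) for length in lengths]
print(rounded)
```

The same function with a base of 1 and a step of 2 decibels would
generate a number series for a line production experiment.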

Ten pieces of 6- by 24-inch poster board were selected, and lines,
one line per board, of the lengths described above were drawn with a
pen which produced a line approximately 1 millimeter wide. On the
back of each piece of poster board a number was written: 1 for the
board on which the .125-inch line was drawn, 2 for the board on which
the .22-inch line was drawn, . . ., and 10 for the board on which the
22.23-inch line was drawn.

Thirty-six independent, random orders of the numbers 1, 2, 3, 4,
5, 6, 7, 8, 9, and 10 were prepared from random number tables. This
provided for three trials for each of 12 subjects. The subjects were
given the same "general introduction" instructions as in the earlier
laboratory experiments, followed by the instructions for magnitude
estimation of line length, modified to request that the subjects
write their magnitude estimations on a data sheet, rather than "in the
lower right hand corner of each page." The 12 subjects were
volunteers from the staff of US Army Tropic Test Center, both military
and civilian. By coincidence, all 12 subjects were male.

In working with each subject, the experimenter first arranged the
10 pieces of poster board in the random order for the first trial for
that subject, then showed the subject the lines, one at a time, with
instructions to write his magnitude estimation of the length of the
line on a data sheet before showing him the next line. After all 10
lines had been shown to the subject on the first trial, the
experimenter arranged the 10 pieces of poster board in the order for
the second trial for that subject, and proceeded to show them to the
subject in the same manner as had been done on the first trial. The
procedures for the third trial were the same as for the first two
trials.

2.  Magnitude  Estimation:  Analysis  and  Results 

The magnitude estimates were arranged in 10 columns, one for each
of the stimulus lines, with estimates for 12 subjects in each column.
Geometric means of the magnitude estimates were computed over
the 12 subjects for each of the 10 line lengths for each trial, and
for each line length for all three trials combined, as had been done
in the first magnitude estimation experiment. The lengths of the 10
lines and the 10 geometric means of the magnitude estimates were
converted to ratios and then to decibels, in the same manner as in the
first magnitude estimation experiment. The magnitude estimates were
then plotted (in decibels) against the lengths of the stimulus lines
(also in decibels) in figure 5.


Figure 5. Magnitude Estimation of Lengths of Lines. Confirmatory
Experiment: Twelve Subjects, Three Trials on Each of 10
Line Lengths. Best-fitting Line (least squares):
Y = -.11 + .92X.


The slope of the best-fitting line in figure 5 is .92, which
agrees quite well with the slope of .94 obtained in the first
magnitude estimation experiment.

3.  Line  Production:  Experimental  Materials  and  Method 

A comparison between the second and third line production
experiments, previously described, shows that reducing the range of the
set of numbers used as stimuli was apparently a step in the right
direction toward the objective of obtaining a power function with an
exponent nearer to 1.00 (see discussion in Section IV, B, 6, Analysis
and Results: Third Experiment). Therefore, in this confirmatory
experiment on line production in response to numbers, a further step in
this direction was taken by reducing the range of the set of numbers
even more, so that the ratio of the largest number to the smallest was
63:1. The following 10 numbers were used as stimuli: 1, 1.6, 2.5, 4,
6.3, 10, 16, 25, 40, and 63. The spacing within this set of numbers
was chosen to produce decibel values at approximately equal
intervals: 0, 2, 4, . . ., 18. This range, from 0 to 18 decibels for
the set of numbers used as stimuli in this experiment, thus
approximates the 0 to 18.42 decibels range for length of lines drawn in
response to numbers in the third line production experiment previously
described.

Use of the longer paper (allowing lines as long as 380 millimeters
to be drawn) was continued from the second and third line production
experiments previously described. The experimental materials were
prepared in a fashion analogous to the previous line production
experiments. Each of the 10 numbers was written on three sheets of
paper for 12 subjects, and the sheets of paper were then sorted into 36
sets of 10 sheets, so that for each subject three sets of the 10
numbers were available for trials one, two and three. Each of these 36
sets was then arranged in a different, independent random order. The
experiment was administered in the same fashion as the earlier line
production experiments, with the exception that the instructions to the
subjects were again modified, this time to specify a range of numbers
from 1 to 63. As before, the subjects were volunteers from the US
Army Tropic Test Center staff, both military and civilian. None
participated in the confirmatory experiment on magnitude estimation,
though some had participated in one or both types of experiments
previously carried out in this project.

4.  Line  Production:  Analysis  and  Results 

The results of this confirmatory experiment on line production in
response to numbers were analyzed and plotted in the same fashion as
those of earlier line production experiments. Figure 6 shows the
plotted results. Again the points fall very nearly on a straight
line, and the slope of the best-fitting line is .78, which is somewhat
nearer 1.00 than the slope of .71 obtained from the third line
production experiment previously described. When the slope of .78
obtained in this confirmatory line production experiment is compared
with the slope of .92 obtained in the confirmatory magnitude estimation
experiment, it can be seen that the regression effect (Stevens, 1975,
pp. 271-281) is still present, but is much reduced from the
corresponding comparison of .53 with .94, obtained in the first
experiments in this series.

D. STABILITY OF MAGNITUDE ESTIMATION DATA

An answer to the question of how stable magnitude estimation data
are may be obtained by examining the results of the three trials
separately. Because our use of the data yielded by magnitude estimation
will be different from the uses often made of data obtained from
responses made to psychological tests and rating instruments, our
evaluation of the stability of magnitude estimation data will be
different from the usual evaluation of stability (or reliability) of
the data yielded by psychological tests and rating instruments.


Figure 6. Line Production in Response to Numbers. Confirmatory
Experiment: Twelve Subjects, Three Trials on Each of 10 Numbers.
Best-fitting Line (least squares): Y = .17 + .78X.


Data from psychological tests and rating instruments are often
used to help make decisions about individual persons whose responses
constitute the data. In this case, therefore, it is important that
there be reasonable stability in the data at the level of the
individual person. Magnitude estimation is being proposed here as a
technique for providing data to aid in decisions, not about the
individual persons whose responses constitute the data, but rather in
decisions concerning the items of materiel which the persons are
evaluating in some way. It is assumed that when decisions are being
made concerning materiel being evaluated by means of the magnitude
estimation technique, data will be available from groups of at least 10
to 12 persons. Therefore, the stability of the data yielded by
magnitude estimation at the level of groups of 10 or 12 persons will be
evaluated.

1.  Comparison  of  Geometric  Means  for  the  Three  Trials 

One method of examining the stability of magnitude estimation data
at the level of 12-person groups is to compare the geometric means of
the magnitude estimates for each line length for each of the three
trials. Table 4 presents the geometric means of the magnitude
estimates for each trial separately, as well as for all three trials
combined, for the first experiment in magnitude estimation.

Table 4. Geometric Means of Magnitude Estimates of Line Lengths:
First Experiment

Lengths
of Lines    1st Trial   2nd Trial   3rd Trial   3 Trials Combined

  1/8 in      0.450       0.630       0.479          0.514
1 1/8 in      3.743       3.913       4.227          3.956
2 1/8 in      7.088       7.374       6.470          6.967
3 1/8 in      8.696      10.094       9.503          9.413
4 1/8 in     13.111      13.250      12.862         13.073
5 1/8 in     15.529      16.191      20.215         17.278
6 1/8 in     16.594      21.359      21.594         19.707
7 1/8 in     19.218      26.018      25.561         23.380

Comparing geometric means across the rows of table 4 shows fairly
good stability, though on the first trial there appears to have been
some inhibition against giving larger magnitude estimates in response
to the longest lines, in contrast to trials 2 and 3. The geometric
means of the magnitude estimates for each trial separately, and for
all three trials combined, for the confirmatory experiment in
magnitude estimation are presented in table 5.

When the geometric means in each row of table 5 are compared,
considerable stability is apparent. The first-trial reluctance of
subjects to give larger magnitude estimates in response to the longer
lines, so apparent in the first experiment, was not found in this
confirmatory experiment.


Table 5. Geometric Means of Magnitude Estimates of Line Lengths:
Confirmatory Experiment

Lengths
of Lines    1st Trial   2nd Trial   3rd Trial   3 Trials Combined

 0.125 in     0.199       0.167       0.253          0.204
 0.22 in      0.299       0.335       0.352          0.328
 0.40 in      0.480       0.435       0.702          0.527
 0.70 in      1.034       0.968       0.932          0.977
 1.25 in      1.623       1.675       2.009          1.761
 2.22 in      2.668       3.097       3.445          3.053
 3.95 in      4.689       4.774       5.243          4.896
 7.03 in      7.865       6.995       7.753          7.527
12.50 in     13.197      10.899      13.145         12.365
22.23 in     23.857      24.907      24.737         24.496


2.  Plot  of  Decibel  Values  for  the  Three  Trials 

Another method of examining the stability of magnitude estimation
data is to plot the decibel values of the magnitude estimates for the
three trials separately, in a fashion analogous to figure 1.* Figure
7 shows such a plot for the first experiment in magnitude estimation.

The line in figure 7 is the best-fitting line for the points based
on the geometric means of all three trials combined, the same line as
appears in figure 1. Though there is some scatter about this line,
the points for the three trials for a line 1 1/8 inches (9.54
decibels) long do not overlap with those for a line 2 1/8 inches (12.30
decibels) long, etc., until reaching the three lines of greatest
length, where the intervals on the decibel scale between lengths of
lines become quite small. The inhibition against giving larger
magnitude estimates in response to the longest lines on the first
trial, noted above, shows up clearly in figure 7.

Figure 8 shows a plot of the decibel values of the magnitude
estimates for the three trials separately, in the confirmatory
experiment. Here, where the intervals on the decibel scale for length
of lines are equal, there is no overlap between the points for the
three trials with any line length and those for the three trials with
any adjacent line length.

3.  Best-Fitting  Lines  for  Trials  1,  2  and  3 

A third method for examining the stability of magnitude estimation
is to compute the best-fitting lines for the points of trials 1, 2,
and 3. This was done, and the slopes for these three best-fitting
lines are .93, .90, and .98, respectively, for the first experiment in
magnitude estimation. These slopes may be compared with a slope of
.94 for the best-fitting line for the points of all three trials
combined. In the confirmatory experiment the slopes of the best-fitting
lines for trials 1, 2 and 3, respectively, are .93, .93 and .89.
These slopes may be compared with a slope of .92 for the best-fitting
line for the points of all three trials combined in the confirmatory
experiment.
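The per-trial slopes for the first experiment can be approximated from
the geometric means in table 4. The sketch below is illustrative only:
the helper name is hypothetical, the stimulus ratios are those of table
2, and all three trials are referred to the common base of .514 used
for the combined data. The choice of base shifts the points up or down
but does not change the slopes.

```python
import math

def best_fit_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Ratios of the eight line lengths to the 1/8-inch line (table 2).
line_ratios = [1, 9, 17, 25, 33, 41, 49, 57]

# Geometric means of magnitude estimates for each trial (table 4).
trial_means = {
    1: [0.450, 3.743, 7.088, 8.696, 13.111, 15.529, 16.594, 19.218],
    2: [0.630, 3.913, 7.374, 10.094, 13.250, 16.191, 21.359, 26.018],
    3: [0.479, 4.227, 6.470, 9.503, 12.862, 20.215, 21.594, 25.561],
}
COMMON_BASE = 0.514  # geometric mean for the shortest line, all trials combined

stimulus_db = [10 * math.log10(r) for r in line_ratios]
slopes = {
    trial: best_fit_slope(stimulus_db,
                          [10 * math.log10(g / COMMON_BASE) for g in means])
    for trial, means in trial_means.items()
}
for trial, slope in slopes.items():
    print(f"trial {trial}: slope = {slope:.2f}")
```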


* The ratios of the geometric means of the magnitude estimates for
the three trials taken separately are taken to the same base,
.514, which is the geometric mean of the magnitude estimates in
response to the shortest line for all three trials taken together
(table 2). Increasing or decreasing the base to which these
ratios are taken simply lowers or raises the points on the graph.
Therefore, using the same base for the ratios for the three trials
taken separately provides a common reference framework for the
three sets of points.

Figure 7. Stability of Magnitude Estimation Data. First Experiment:
Twelve Subjects, Three Trials on Each of Eight Line Lengths.
Best-fitting Line (least squares): Y = -.09 + .94X, Based
on Geometric Means of All Three Trials Combined (figure 1).

Figure 8. Stability of Magnitude Estimation Data. Confirmatory
Experiment: Twelve Subjects, Three Trials on Each of 10
Line Lengths. Best-fitting Line (least squares):
Y = -.11 + .92X, Based on Geometric Means of All Three
Trials Combined (figure 5).

4. Stability of Magnitude Estimation Data as Measured by the
Intraclass Correlation Coefficient


The intraclass correlation coefficient involves an analysis of
variance approach to the stability (reliability) of measurements (see
footnote 6). The mathematical model appropriate for this study of
magnitude estimation data is that involved in Case 2, as described in
the recent Shrout and Fleiss article on intraclass correlations (see
footnote 7). In this study we have a sample of persons, each of whom
has rated (made magnitude estimates in response to) each of a number of
lines of different lengths. We desire to generalize from this sample
of persons (raters or judges) to a population of persons; therefore,
the person or subject variable in the analysis of variance is a random
effect. Further, as was discussed at the beginning of this section, we
are interested in the stability of the mean of magnitude estimates made
by a group of 10 to 12 persons, rather than in the stability of a
magnitude estimate made by one person. Therefore, the formula used to
compute intraclass correlation coefficients in this study was that
appropriate for mean ratings of a group of raters (Shrout and Fleiss,
p. 426).

This intraclass correlation coefficient may be thought of as the
ratio of the component of variance due to treatments (in this case,
the individual stimulus lines of different lengths) to the sum of the
components of variance due to treatments, subjects, interaction
between treatments and subjects, and error. In other words, this
intraclass correlation coefficient tells us the proportion of the total
variance that is accounted for by the treatments.
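The report does not print the formula it used. As a hedged
illustration, the sketch below implements the Shrout and Fleiss Case 2
coefficient for the mean of k raters, usually written ICC(2,k), from
the mean squares of a two-way analysis of variance; it is assumed, not
confirmed, that this is the form referred to above. Rows correspond to
targets (here, stimulus lines) and columns to raters (subjects). The
small ratings matrix is illustrative only and is not data from this
study.

```python
def icc_2k(ratings):
    """ICC(2,k): two-way random effects, reliability of the mean of k raters.

    ratings[i][j] is the rating given to target i by rater j.
    """
    n = len(ratings)         # number of targets (rows)
    k = len(ratings[0])      # number of raters (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_targets = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_targets - ss_raters

    bms = ss_targets / (n - 1)              # between-targets mean square
    jms = ss_raters / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (bms - ems) / (bms + (jms - ems) / n)

# Illustrative ratings: six targets (rows) rated by four raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2k(example)
print(round(icc, 2))
```

In the experiments described here the matrix would have one row per
stimulus line and one column per subject, computed separately for each
trial.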

The intraclass correlation coefficients for the two magnitude
estimation experiments are presented in table 6. It can be seen that
the coefficients in the first experiment are notably lower than those
in the confirmatory experiment, but that they increase from the first,
to the second, to the third trial. This means that the variability
between subjects in magnitude estimations was much greater in the
first experiment than in the confirmatory experiment, and that this
variability decreased from trial to trial in the first experiment.
The first magnitude estimation experiment was done at the very
beginning of this series of laboratory experiments. No ready
explanation has been developed for this trend towards increasing
stability in the first experiment. It is apparent from the intraclass
correlation coefficients obtained in the confirmatory experiment,
however, that highly stable (reliable) magnitude estimation data can be
obtained.


6 Winer, B. J. Statistical Principles in Experimental Design,
pp. 283-296.

7 Shrout, P. E., and Fleiss, J. L. "Intraclass Correlations: Uses
in Assessing Rater Reliability," Psychological Bulletin,
pp. 420-428.

Table 6. Intraclass Correlation Coefficients

                                  Trials
                              1      2      3

First Experiment             .57    .61    .67
Confirmatory Experiment      .94    .94    .94


Since the intraclass correlation coefficient is a ratio of variance
accounted for by the treatments to total variance, these obtained
intraclass correlation coefficients of .94 are equivalent to
reliability coefficients of the usual kind of .97 (= the square root of
.94).

Stability of line production data was found to be approximately
the same as that of the magnitude estimation data. However, those
results are not presented and discussed in this report, because the
magnitude estimation response mode has been chosen as the most feasible
and convenient response mode for use in human factors evaluations.


SECTION  V.  FIELD  STUDIES  OF  MAGNITUDE  ESTIMATION 


After some experience with laboratory studies of magnitude
estimation, field studies were undertaken. Three different field
studies were performed: (1) a comparison of Personnel Armor System for
Ground Troops (PASGT) helmets and vests with standard helmets and vests
with respect to comfort; (2) a comparison of four different machine
guns with respect to perceived accuracy, ease of opening and several
other features of the weapons; and (3) a study in which soldiers
carried loads ranging from 20 to 50 pounds over a 4-kilometer course in
the jungle and were asked to give magnitude estimates of the
difficulties of various parts of the course.

In  field  studies  of  magnitude  estimation,  it  is  usually  not  pos¬ 
sible  to  control  and  measure  stimulus  variables.  This  means  that 
clearcut  relationships  between  physical  stimulus  variables  and  subjec¬ 
tive  response  measures  cannot  be  shown,  as  was  done  in  the  laboratory 
experiments.  In  addition,  the  subjective  variables  measured  with  the 
magnitude  estimation  technique  in  field  studies  are  likely  to  be  com¬ 
plex  functions  of  a  number  of  physical  stimulus  variables  acting  to¬ 
gether.  Thus,  perceived  difficulties  of  parts  of  the  4-kilometer  jun¬ 
gle  course  are  likely  to  depend  not  only  on  the  loads  carried  by  the 
soldiers,  but  also  on  temperature  and  humidity,  whether  it  is  raining 
or  not,  the  physical  condition  of  individual  soldiers,  and  a  host  of 
conditions internal to individual soldiers which may be lumped together
under a label such as "morale" or "motivation."

Having  demonstrated  that  the  magnitude  estimation  technique  does  a 
good  job  of  measuring  subjective  variables  in  laboratory  experiments, 
where  the  physical  stimulus  can  be  precisely  controlled  and  where  a 
close  relationship  can  be  shown  between  the  physical  stimulus  and  a 
subjective  variable,  the  next  step  is  to  use  magnitude  estimation  in 
field  studies  involving  human  factors  evaluations.  In  these  field 
studies,  where  stimulus  variables  cannot  be  precisely  controlled  and 
subjective  variables  are  likely  to  depend  on  a  number  of  stimulus  var¬ 
iables  acting  together  in  a  complex  fashion,  it  will  probably  not  be 
possible  to  demonstrate  close  relationships  between  objective  stimulus 
variables  and  subjective  variables,  as  was  done  in  the  laboratory  ex¬ 
periments.  Rather,  it  will  be  assumed  that  magnitude  estimation,  when 
used  in  the  less  controlled  and  defined  setting  of  field  experiments, 
will  continue  to  do  a  competent  job  of  measuring  subjective  variables, 
as  it  did  in  the  laboratory  experiments. 

A.  COMPARISON  OF  HELMETS  AND  VESTS 


1. Data Acquisition Procedures


Magnitude estimation data were gathered during the Development
Test II of the Personnel Armor System for Ground Troops (PASGT), which
the US Army Tropic Test Center carried out in 1976-1977.8 In this
test, soldiers traversed a 4-kilometer jungle course (known as the
Man-Pack Portability Course (MPPC), described in detail in reference 9)
repeatedly over a period of several days, each time wearing a
different helmet-vest combination. The soldiers also performed on a
laser-rifle range and a land navigation course in the jungle each day
before they traveled the MPPC and again after they traveled the MPPC.
At the end of the test period, when the soldiers had worn all of the
helmet-vest combinations, 20 of them were asked to make magnitude
estimations of the comfort of the helmets and vests.

The  following  helmet-vest  combinations  were  worn  by  the  soldiers: 

a. Kevlar helmet (38 oz/ft²) (PASGT-1) with Kevlar vest.

b. Kevlar helmet (30 oz/ft²) (PASGT-2) with Kevlar vest.

c. Standard M-1 helmet with standard B-nylon vest.

The  following  instructions  were  given  to  the  soldiers  as  the  mag¬ 
nitude  estimate  data  were  gathered: 

We're  trying  out  a  new  way  of  asking  you  what  you  think  of 
this  equipment.  Think  about  the  overall  comfort  of  the  two 
different  vests  you  wore.  Let's  say  that  a  very  large  number 
represents  the  most  comfortable  vest  you  can  think  of,  and  a 
very  small  number  represents  the  most  uncomfortable.  Now 
think  of  a  number  that  represents  how  comfortable  or  uncom¬ 
fortable  you  think  the  standard  vest  was.  Please  write  this 
number  in  the  blank  space  next  to  "Standard  Vest." 

Now  think  of  another  number  that  represents  how  comfortable 
or  uncomfortable  you  thought  the  new  vest  was.  Please  write 
this  number  in  the  blank  space  next  to  "New  Vest."  Remember, 
if  you  thought  the  new  vest  was  more  comfortable  than  the 
standard  vest,  you  should  pick  a  larger  number  than  you  did 
for  the  standard  vest.  If  you  thought  the  new  vest  was  less 
comfortable  than  the  standard  vest,  you  should  pick  a  smaller 
number  than  you  did  for  the  standard  vest.  The  bigger  the 
difference  in  comfort,  the  bigger  the  difference  between  the 
numbers  you  pick  should  be. 

Now,  let's  think  about  the  comfort  of  the  three  different 
helmets  you  wore.  Please  write  numbers  in  the  spaces  next  to 
"PASGT-1  Helmet,"  "PASGT-2  Helmet,"  and  "Standard  Helmet,"  to 
show how comfortable you felt each helmet was. Remember, the
more comfortable you felt the helmet to be, the larger the
number; and the less comfortable, the smaller the number.


8 Haverland, E. M.; Novak, C. A.; Johnson, R. 1., Jr.; Williamson,
R. L.; and Kindick, C. M. Development Test II of Personnel Armor
System for Ground Troops (PASGT).

9 Test Operations Procedure (TOP) 1-3-550, Man-Pack Portability
Testing in the Tropics.

2.  Results 

The arithmetic mean of the magnitude estimates of the comfort of
the standard vest was 2.15, while that for the new vest (PASGT) was
53.00.* This difference seems quite large, but the between-subjects
variability was also large, since some subjects restricted their
magnitude estimates to as little as two points, while others let their
magnitude estimates range over more than 200 points. Nevertheless, a
correlated t-test for difference between the magnitude estimates for
standard and new vests yielded a value for t of 3.54. With 19 degrees
of freedom and a two-tailed test, the probability of obtaining a value
of t this large, if there were no difference in magnitude estimates of
comfort for the two vests, is less than .01.
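
The probability statement above can be checked numerically. The sketch below is not the report's own computation: it shows the form of a correlated (paired) t statistic and then integrates the Student's t density by Simpson's rule to get a two-tailed probability for the reported t = 3.54 with 19 degrees of freedom. The individual soldiers' estimates are not reproduced in the report, so `paired_t` is included only to show the calculation; all function names are illustrative.

```python
import math
import statistics

def paired_t(first, second):
    """Correlated (paired) t statistic and its degrees of freedom."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed probability, integrating the density from 0 to |t| (Simpson's rule)."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t|)
    return 1.0 - 2.0 * central_half

# Values reported in the text: t = 3.54 with 19 degrees of freedom.
p_value = two_tailed_p(3.54, 19)
print(f"two-tailed p = {p_value:.4f}")
```

With 20 soldiers each rating both vests, the paired design leaves 20 - 1 = 19 degrees of freedom, which matches the figure given in the text.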

The  arithmetic  means  of  the  magnitude  estimates  of  the  comfort  of 
the  three  helmets  are  shown  below: 

PASGT-1 Helmet      50.90

PASGT-2 Helmet      57.55

Standard Helmet      2.25

A  repeated  measures  analysis  of  variance  of  the  magnitude  estimates  of 
the  comfort  of  the  three  helmets  is  given  in  table  7. 


Table 7. Analysis of Variance of Magnitude Estimates
         for PASGT-1, PASGT-2 and Standard Helmets

Source of Variance        SS         df        MS          F        p

Between subjects       71,572.73     19        --          --       --
Within subjects        99,648.67     40        --          --       --
   Helmets             36,460.90      2     18,230.45     10.96    <.001
   Error               63,187.77     38      1,662.84      --       --

TOTAL                 171,221.40     59
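
The entries in table 7 can be cross-checked from the sums of squares and degrees of freedom alone. The short sketch below uses only the printed values and recomputes the mean squares and the F ratio; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom as printed in table 7.
ss = {"between": 71572.73, "within": 99648.67,
      "helmets": 36460.90, "error": 63187.77, "total": 171221.40}
df = {"between": 19, "within": 40, "helmets": 2, "error": 38, "total": 59}

# Mean square = sum of squares / degrees of freedom.
ms_helmets = ss["helmets"] / df["helmets"]
ms_error = ss["error"] / df["error"]

# F ratio for the helmet effect in a repeated measures design.
f_ratio = ms_helmets / ms_error

# The within-subjects line splits into helmets + error, and the total
# splits into between + within, for both SS and df.
ss_within_ok = abs(ss["helmets"] + ss["error"] - ss["within"]) < 0.01
ss_total_ok = abs(ss["between"] + ss["within"] - ss["total"]) < 0.01
df_ok = (df["helmets"] + df["error"] == df["within"]
         and df["between"] + df["within"] == df["total"])

print(f"MS helmets = {ms_helmets:.2f}, MS error = {ms_error:.2f}, F = {f_ratio:.2f}")
print("partitions consistent:", ss_within_ok and ss_total_ok and df_ok)
```

The 59 total degrees of freedom correspond to 20 soldiers times 3 helmets, less one.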

* Though magnitude estimation yields measurement on a ratio scale,
special  statistical  techniques  for  determining  the  significance  of 
differences  between  geometric  means  are  not  available.  Therefore, 
the  usual  statistical  techniques  (t-tests  and  analysis  of  vari¬ 
ance)  which  are  appropriate  for  determining  differences  between 
arithmetic  means  are  used  in  this  report. 

3. Discussion


These results--the new vest (PASGT) being rated more comfortable
than the standard vest, and both of the new PASGT helmets being
rated more comfortable than the standard helmet--agree with a large
number of objective performance test results and subjective
questionnaire results obtained during the PASGT test and documented
in the test report (Haverland et al.). The results obtained with
magnitude estimation appear more clear-cut than do the
results  of  the  performance  tests  and  questionnaires.  Of  course,  the 
performance  tests  and  questionnaires  covered  a  much  wider  variety  of 
variables  than  the  magnitude  estimates  of  comfort,  and  for  this  reason 
should  be  depended  upon  as  giving  much  more  comprehensive  evidence  of 
the  superiority  of  the  PASGT  equipment  to  the  standard  vest  and  hel¬ 
met,  than  the  magnitude  estimates  of  comfort.  Nevertheless,  magnitude 
estimation  appears  to  have  done  a  good  job  of  measuring  differences  in 
comfort  between  the  new  PASGT  equipment  and  the  old  standard  vest  and 
helmet. 

B.  COMPARISON  OF  FOUR  MACHINE  GUNS 

1.  Data  Acquisition  Procedures 

Magnitude estimation data were gathered near the end of the Machine
Gun Accuracy and Dispersion (MAD) test conducted at USATTC during
1977-1978. In this test nine different machine guns were fired at
32-foot square targets (at ranges of 300 and 600 meters) by regular
troops (MOS 11B, Infantryman, M60 machine gun qualified) from the 193d
Infantry Brigade (Canal Zone), and large amounts of data on
miss-distances were gathered. The 10 soldiers who provided the magnitude
estimation data had fired these particular four machine guns over a
period of 8 weeks.

The  four  machine  guns  were: 

a. MG-1A3, a German 7.62-mm weapon with a heavy barrel. The weapon
weighed approximately 35 pounds.

b. RPK, a Soviet 7.62-mm weapon--a member of the AK47 family. It
is a light weapon, weighing approximately 15 pounds.

c. PKM, a Rumanian 7.62-mm weapon. It is of intermediate weight,
approximately 23 pounds.

d. M60, the standard US Army machine gun (7.62 mm). It is also
of intermediate weight, weighing approximately 23 pounds. The soldiers
had had more experience with this weapon than with the other
three, having fired it extensively before they participated in this
test.

The soldiers were given the following instructions:

Number Rating of the MAD Weapons

You have fired four different weapons during the time you
have been helping us on this project: the MG-1A3 (German
7.62 mm), RPK (Soviet), PKM (Rumanian), and M60 machine gun.
Now,  we're  trying  out  a  new  way  of  asking  you  what  you  think 
of  these  weapons.  To  start  with,  think  about  the  accuracy  of 
the  weapons  you've  fired.  Let's  say  that  a  very  small  number 
represents  a  weapon  that  was  very  inaccurate,  and  a  very 
large  number  represents  a  weapon  that  was  extremely  accurate. 

Now  think  of  a  number  that  represents  how  accurate  or  inac¬ 
curate  you  thought  the  MG-1A3  was.  You  should  not  use  zero 
or  negative  numbers.  Please  write  this  number  in  the  blank 
space  on  the  first  line  below  MG-1A3. 

Now  think  of  another  number  that  represents  how  accurate  or 
inaccurate  you  thought  the  RPK  was.  Please  write  this  number 
in  the  blank  spaces  on  the  first  line  below  RPK.  Remember, 
if  you  thought  the  RPK  was  more  accurate  than  the  MG-1A3,  you 
should  pick  a  larger  number  than  you  did  for  the  MG-1A3.  If 
you  thought  the  RPK  was  less  accurate  than  the  MG-1A3,  you 
should  pick  a  smaller  number  than  you  did  for  the  MG-1A3. 
The  bigger  the  difference  in  accuracy,  the  bigger  the  differ¬ 
ence  between  the  numbers  you  pick  for  the  two  weapons  should 
be. 

Now,  go  ahead  and  pick  numbers  to  represent  the  accuracy  or 
inaccuracy  of  the  PKM  and  the  M60  and  write  them  in  the  spac¬ 
es  on  the  first  line  under  PKM  and  M60.  The  more  accurate 
you  felt  the  weapon  was,  the  larger  the  number  you  should 
choose  for  it.  Each  of  you  should  choose  your  numbers  by 
yourself,  without  talking  to  anybody  else  about  it.  You  can 
talk about it after we're finished.

Wait until everybody has finished the first line on accuracy.

Now think about how easy or hard it was to open the weapons.
If you thought it was easy to open a weapon, you should give
that weapon a large number--the easier it was to open the
weapon, the larger its number should be. If you thought it
hard to open a weapon, you should give that weapon a small
number--the harder it was to open, the smaller its number
should be. Again, you should not use zero or negative numbers.
Now let's go ahead and put down numbers for how easy
or hard you thought it was to open the four weapons.

Are there any questions? Now let's go ahead and put down
numbers for the other six things I'm asking you about these
weapons. Think about each question for a minute or two, and
then put down a number for each of the four weapons.

The  data  sheet  on  which  the  soldiers  wrote  their  magnitude  estimates 
is  reproduced  below,  with  means  of  obtained  data  entered  in  response 
spaces  (table  8). 

The  instructions  were  given  to  the  10  soldiers  in  a  group,  and 
they  went  ahead  with  making  their  magnitude  estimates.  As  the  in¬ 
structions  were  given,  one  soldier  asked  if  he  should  use  a  scale  from 
1  to  10.  The  experimenter  explained  that  they  should  use  any  numbers 
they  wanted,  other  than  0  or  negative  numbers,  but  all  10  of  the  sol¬ 
diers  apparently  followed  the  suggestion  implicit  in  this  question  and 
restricted  their  estimates  to  the  range  of  1  to  10.  This  certainly 
reduced  the  between-subjects  variance  of  the  magnitude  estimates,  and 
beyond  this,  it  is  hard  to  guess  what  the  effects  of  this  restriction 
might have been, compared with the usual use of a wider range of
numbers.

2.  Results 

The  arithmetic  means  of  the  magnitude  estimates  for  the  four  ma¬ 
chine  guns  for  each  of  the  eight  questions  asked  are  presented  in  ta¬ 
ble  8. 

Eight repeated measures analyses of variance were carried out on
the data from which the means in table 8 were computed. These analyses
of variance are presented in table 9. From the data in table 9,
it can be seen that the magnitude estimates differed significantly
among the four machine guns for questions 2, 5, and 7. Referring to
table 8, it can be seen that the MG-1A3 and M60 machine guns were
considered easier to open than the RPK and PKM machine guns (question
2). It was considered easier to operate the charging handle on the
MG-1A3 than on the other three machine guns (question 5). And using
the safety was considered easier on the MG-1A3 and M60 machine guns
than on the RPK and PKM machine guns (question 7).

3. Discussion

Use of the magnitude estimation technique appears to have been
successful in measuring the soldiers' subjective responses to the four
machine guns, in that statistically significant results were obtained
for three of the eight questions. The results for these three questions
were in accordance with the soldiers' informal opinions of the
machine guns. As noted earlier, it is impossible to estimate the
effects that the soldiers' use of a 1 to 10 scale may have had on these
results. An objective, external criterion was available for only one
of the eight questions--that concerning accuracy. Average horizontal
and vertical miss-distances for the four machine guns at both 300 and
600 meters were obtained from the draft report of the MAD test, and
are presented in table 10. Examination of these miss-distances shows


Table 8. Arithmetic Means of Magnitude Estimates for Four Machine Guns
         Entered on Data Collection Form

                                        MG-1A3    RPK     PKM     M60

1. The weapon was--                       6.2     c .2    6.6     8.0
      Accurate: big number
      Inaccurate: small number

2. The weapon was--                       8.3     7.3     7.1     8.5
      Easy to open: big number
      Hard to open: small number

3. Aiming from the bipod was--            7.0     5.7     6.6     7.4
      Easy: big number
      Hard: small number

4. Firing from the bipod was--            7.2     7.0     7.1     7.1
      Easy: big number
      Hard: small number

5. Operating the charging                 8.6     6.7     6.1     6.6
   handle was--
      Easy: big number
      Hard: small number

6. Squeezing the trigger was--            8.1     7.3     7.0     8.1
      Easy: big number
      Hard: small number

7. Using the safety was--                 8.3     7.0     6.9     8.3
      Easy: big number
      Hard: small number

8. Overall, do you consider               7.7     6.7     7.0     8.2
   this weapon--
      Good: big number
      Poor: small number

no consistent pattern; none of the machine guns appear to be more
accurate than any other machine gun. This lack of a consistent pattern
in the objective accuracy data is consonant with the fact that there
were no statistically significant differences in subjective judgments
as to the accuracy of the four machine guns.

C.  MAGNITUDE  ESTIMATION  OF  DIFFICULTY  OF  MAN-PACK  PORTABILITY  COURSE 


The Man-Pack Portability Course (MPPC) is the same 4-kilometer
jungle course as was used in the PASGT test of helmets and vests
mentioned earlier. Detailed descriptions of the development and intended


Table 10. Average Miss-Distances for Four Machine Guns

                               Miss-Distances*
                      300 meters              600 meters
Machine Gun      Horizontal   Vertical   Horizontal   Vertical

MG-1A3             -23.9         9.0       -54.8        13.5
RPK                -15.3       -17.7        13.9        -1.3
PKM                -14.8        28.7       -25.2        39.5
M60                -19.8       -10.6       -18.4         7.2

* Each average miss-distance is based on 500 rounds (50 rounds by
each of 10 soldiers). Miss-distances are in inches. Negative
values are to the left of, or below the bullseye; positive values
are to the right of, or above the bullseye.


uses of this course may be found in Test Operations Procedure (TOP)
1-3-550 (1973), Williamson and Kindick (1974)10 and Williamson and
Kindick (1975)11. Groups of soldiers carried four different loads
in order to introduce variation in difficulty of traversing the
course. The soldiers were then asked to give magnitude estimates of
the difficulty of several parts of the course.

1.  Details  of  Experimental  Procedures 

Four  groups  of  five  soldiers  each  traversed  the  MPPC  on  each  of  4 
consecutive  days,  5-8  September  1978.  The  four  groups  started  the 
course  at  approximately  30-minute  intervals,  so  they  would  not  encoun¬ 
ter  each  other  on  the  course.  Each  group  took  2  to  3  hours  to  tra¬ 
verse  the  course,  and  on  each  day  the  groups  traversed  the  course  gen¬ 
erally  between  0800  and  1200  hours.  Each  of  the  four  groups  of  sol¬ 
diers  came  from  a  different  company  of  the  4th  Battalion  (Mech),  20th 
Infantry,  stationed  at  Fort  Clayton,  Canal  Zone  (Companies  A,  B  and  C, 
and  the  Combat  Support  (CS)  Company).  Four  persons  from  USATTC  tra¬ 
versed  the  course  with  the  soldiers,  one  with  each  group,  serving  as 
timers  of  the  performances  of  the  men  on  various  parts  of  the  course. 
Each  of  the  soldiers  carried  his  weapon  (M-16)  and  a  cartridge  belt 
with  two  canteens,  in  addition  to  a  pack  which  was  loaded  to  one  of 
four  weights:  20-25  pounds,  30-35  pounds,  40-45  pounds,  and  50-55 


10 Williamson, R. L., and Kindick, C. M. Human Performance in the
Tropics I: Man-Packing a Standard Load Over a Typical Jungle
Course in the Wet and Dry Season.

11 Williamson, R. L., and Kindick, C. M. Human Performance in the
Tropics II: A Pilot Study on Load-Carrying Test Methodology.
pounds. The loads carried, the order of traversing the course, and
the assignment of timers to the groups were arranged in a balanced
fashion, as shown in table 11. It can be seen in table 11 that each
group carried each of the four different loads, traversed the course
once in each of the four turns (1st, 2nd, 3rd, and 4th), and was
assigned each of the four different timers.


Table 11. Experimental Design for Loads Carried, Order of
          Traversing the MPPC and Assignment of Timers

Order of                                  Days
Traversing
MPPC          5 Sep            6 Sep            7 Sep            8 Sep

1st           Co. CS           Co. C            Co. B            Co. A
              20-25 lbs        50-55 lbs        50-55 lbs        20-25 lbs
              (9.1-11.3 kg)    (22.7-24.9 kg)   (22.7-24.9 kg)   (9.1-11.3 kg)
              Timer 4          Timer 4          Timer 4          Timer 4

2nd           Co. C            Co. CS           Co. A            Co. B
              30-35 lbs        40-45 lbs        40-45 lbs        30-35 lbs
              (13.6-15.9 kg)   (18.1-20.4 kg)   (18.1-20.4 kg)   (13.6-15.9 kg)
              Timer 3          Timer 2          Timer 2          Timer 3

3rd           Co. B            Co. A            Co. CS           Co. C
              40-45 lbs        30-35 lbs        30-35 lbs        40-45 lbs
              (18.1-20.4 kg)   (13.6-15.9 kg)   (13.6-15.9 kg)   (18.1-20.4 kg)
              Timer 2          Timer 3          Timer 3          Timer 2

4th           Co. A            Co. B            Co. C            Co. CS
              50-55 lbs        20-25 lbs        20-25 lbs        50-55 lbs
              (22.7-24.9 kg)   (9.1-11.3 kg)    (9.1-11.3 kg)    (22.7-24.9 kg)
              Timer 1          Timer 1          Timer 1          Timer 1
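
The balance described for table 11 can be verified mechanically. The sketch below encodes one reading of the table as (day, turn, company, load, timer) tuples; that encoding and the function name are assumptions made here for illustration. It then checks that every company meets each load, each turn, and each timer exactly once over the 4 days.

```python
from collections import defaultdict

# (day, turn, company, load in pounds, timer) as read from table 11.
runs = [
    ("5 Sep", 1, "CS", "20-25", 4), ("5 Sep", 2, "C", "30-35", 3),
    ("5 Sep", 3, "B", "40-45", 2), ("5 Sep", 4, "A", "50-55", 1),
    ("6 Sep", 1, "C", "50-55", 4), ("6 Sep", 2, "CS", "40-45", 2),
    ("6 Sep", 3, "A", "30-35", 3), ("6 Sep", 4, "B", "20-25", 1),
    ("7 Sep", 1, "B", "50-55", 4), ("7 Sep", 2, "A", "40-45", 2),
    ("7 Sep", 3, "CS", "30-35", 3), ("7 Sep", 4, "C", "20-25", 1),
    ("8 Sep", 1, "A", "20-25", 4), ("8 Sep", 2, "B", "30-35", 3),
    ("8 Sep", 3, "C", "40-45", 2), ("8 Sep", 4, "CS", "50-55", 1),
]

def is_balanced(design):
    """True if each company has 4 distinct loads, turns, and timers."""
    seen = defaultdict(lambda: {"load": set(), "turn": set(), "timer": set()})
    for _day, turn, company, load, timer in design:
        seen[company]["load"].add(load)
        seen[company]["turn"].add(turn)
        seen[company]["timer"].add(timer)
    return all(len(values) == 4
               for factors in seen.values()
               for values in factors.values())

print("balanced:", is_balanced(runs))
```

Because each company makes exactly four traverses, four distinct values per factor means each value occurs exactly once for that company.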

The  same  five  persons  remained  in  each  of  the  four  groups  for  the 
4  days,  with  the  exception  of  some  necessary  substitutions.  Of  the 
total  of  80  individual  traverses  of  the  MPPC  (20  persons  x  4  days),  10 
were  accomplished  by  substitutes.  Thus  the  substitution  rate  was  12.5 
percent.  Twelve  of  the  20  soldiers  who  traversed  the  course  on  the 
first  day  were  present  and  traversed  the  course  on  each  of  the  3  re¬ 
maining  days. 

A  traversal  of  the  MPPC  consists  of  several  different  parts,  as 
follows: 

--Forced March, 5,200 feet, group timed
--15-minute break
--Walk to Frijoles River, 3-minute break while crossing river
--Walk to beginning of Uphill Run (timer goes ahead to top of hill)
--Uphill Run, 300 feet, individually timed
--15-minute break
--Walk down hill to Frijoles River, 3-minute break
--Walk to marker 283
--5-minute break
--Walk to beginning of Double Time (timer goes ahead to end of
  Double Time Course)
--Double Time, 200 feet, individually timed
--5-minute break
--Walk to end of course

The Total Time required by each group to traverse the MPPC was
recorded. To obtain a time for each group for the Normal Walk portions
of the course, the group time for the Forced March, the sum of
the individual times for the Uphill Run, the sum of the individual
times for the Double Time, and 46 minutes of breaks were subtracted
from the Total Time required by the group to traverse the course. The
men in each group were identified by designations (A-1, A-2, . . .,
A-5; B-1, . . ., B-5; C-1, . . ., C-5; CS-1, . . ., CS-5) written on
strips of white engineer tape tied around their upper arms.
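
The Normal Walk computation described above can be sketched as follows. The function name and the example times are hypothetical and are not data from the study; the break total is built from the breaks listed for the course (15 + 3 + 15 + 3 + 5 + 5 minutes).

```python
# Scheduled breaks on the course, in minutes; these sum to the 46 minutes
# mentioned in the text.
BREAK_MINUTES = 15 + 3 + 15 + 3 + 5 + 5

def normal_walk_minutes(total_min, forced_march_min, uphill_run_sec, double_time_sec):
    """Group Normal Walk time: Total Time minus timed events and breaks.

    total_min and forced_march_min are group times in minutes;
    uphill_run_sec and double_time_sec are lists of individual times in seconds.
    """
    timed_events_min = (forced_march_min
                        + sum(uphill_run_sec) / 60
                        + sum(double_time_sec) / 60)
    return total_min - timed_events_min - BREAK_MINUTES

# Hypothetical group of five men (illustrative values only).
example = normal_walk_minutes(
    total_min=150,
    forced_march_min=30,
    uphill_run_sec=[60, 60, 60, 60, 60],
    double_time_sec=[24, 24, 24, 24, 24],
)
print(f"Normal Walk time: {example:.0f} minutes")
```

The individual run times are summed because the men run one at a time, so each man's run adds to the group's elapsed time.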

Both before they traversed the MPPC and after they had traversed
the MPPC, the soldiers were asked to strip to their shorts to be
weighed. Likewise, their canteens (without cup or cover) were weighed
when full before traversing the course. Body weights were recorded to
the nearest tenth of a pound, and canteen weights to the nearest
ounce. Body weight after traversing the course was subtracted from
body weight before traversing the course to obtain body weight loss.
Likewise, weight of empty or partially empty canteens was subtracted
from weight of full canteens to obtain the weight of water drunk by
each soldier while traversing the course. Weight of water drunk was
then added to body weight loss to obtain a measure of sweat loss for
each soldier during traversal of the course. Finally, sweat loss was
expressed as a percentage of initial body weight.
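
The sweat-loss arithmetic described above can be written out as a short sketch. The function name and the example weights are hypothetical. The conversion assumes the canteen weights are in avoirdupois ounces (16 to the pound), since body weights are in pounds.

```python
OUNCES_PER_POUND = 16

def percent_body_weight_lost(weight_before_lb, weight_after_lb,
                             canteens_full_oz, canteens_after_oz):
    """Sweat loss as a percentage of initial body weight."""
    body_weight_loss_lb = weight_before_lb - weight_after_lb
    water_drunk_lb = (canteens_full_oz - canteens_after_oz) / OUNCES_PER_POUND
    sweat_loss_lb = body_weight_loss_lb + water_drunk_lb
    return 100 * sweat_loss_lb / weight_before_lb

# Hypothetical soldier: 160.0 lb before, 158.0 lb after, and canteens that
# weighed 80 oz full and 32 oz on return (illustrative values only).
example = percent_body_weight_lost(160.0, 158.0, 80, 32)
print(f"{round(example, 1)} percent of initial body weight")
```

Water drunk is added back because it offsets part of the weight lost as sweat; without it, the drop in body weight alone would understate sweat loss.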

Thus, on each of the 4 days the following objective data were
collected:

--Forced  March  time,  group  measure  (N  =  4),  to  nearest  minute 
--Uphill  Run  time,  individual  measure  (N  =  20),  to  nearest  second 
--Double-Time  time,  individual  measure  (N  =  20),  to  nearest  second 
--Total  Time,  group  measure  (N  =  4),  to  nearest  minute 
--Normal  Walk  time,  group  measure  (N  =  4),  to  nearest  minute 
--Percent  Initial  Body  Weight  Lost,  individual  measure  (N  =  20),  to 
nearest  0.1  percent. 

It is recognized that Total Time is not independent of the other
times, because Total Time is a composite of the other scores.

After  each  group  finished  traversing  the  MPPC  on  each  day,  subjec¬ 
tive  data  were  gathered  by  asking  each  soldier  to  make  magnitude  esti¬ 
mates  of  the  difficulty  of  the  following  parts  of  the  course: 

--Forced March
--River Crossing
--Uphill Run
--Double Time
--Walking Parts

Weather conditions were typical rainy season conditions for the
Canal Zone, with frequent afternoon and evening rains. Thus the
course, which is almost entirely under jungle canopy, was quite
slippery and muddy on all 4 days. However, it did not rain while the
groups were actually traversing the course, with the exception of the
last 15 minutes of the traversal by the last group on the 4th day.

On  the  1st  day  when  the  soldiers  arrived  at  the  site  of  the  MPPC, 
they  were  given  the  following  explanation  of  what  they  were  to  do: 

We're  trying  out  a  new  method  of  measuring  what  you  think 
about  some  job  or  task.  We're  going  to  ask  you  to  go  over 
what we call a Man-Pack Portability Course in the jungle.
It's about 4 kilometers long and has several parts in it.
The course is marked with yellow arrows nailed to trees. You
may  find  some  fallen  trees  across  the  course,  but  just  go 
ahead  and  follow  the  arrows,  climbing  over  or  going  around 
any  obstacles.  After  you've  gone  over  the  course,  I  will  ask 
you  to  tell  me  how  difficult  the  parts  of  the  course  were  by 
simply  giving  me  a  number  for  each  part  of  the  course;  the 
more  difficult  the  part  of  the  course,  the  bigger  the  number 
you  should  give  me. 

You  will  go  over  the  Man-Pack  Portability  Course  in  groups  of 
five  men,  one  group  from  each  Company.  On  some  parts  of  the 
course  we  will  time  you  as  a  group,  and  on  other  parts  we 
will  time  each  one  of  you  separately.  We're  interested  in 
how  much  weight  you  lose  in  sweat  as  you  go  over  the  course, 
so  we'll  weigh  you  before  you  go  onto  the  course,  and  after 
you  come  off  it.  We  will  ask  you  to  strip  to  your  shorts 
when  we  weigh  you,  and  we  also  want  to  weigh  your  canteens. 

Now,  each  company  came  out  here  with  a  set  of  five  loads  to 
carry on the course. Some of the loads are heavy and some
are lighter. We'll switch these loads around each day, so
that each of you will carry all four different loads, from
light  to  heavy,  over  the  4  days  we'll  be  out  here.  We  will 
start  today  with  the  men  from  each  company  carrying  the  loads 
they  brought  with  them. 

Here's a chart that tells you a little about what's in the
Man-Pack Portability Course: first there's a 5,200-foot
Forced March, nearly a mile. We time you as a group on
this. Then you take a 15-minute break. Then you walk on
down to the Frijoles river and take a 3-minute break here.



Please  don't  drink  the  river  water  because  it's  probably  con¬ 
taminated.  After  you  cross  the  river,  you  walk  part  way  up  a 
hill,  to  the  starting  point  of  the  300-foot  Uphill  Run. 

Then,  one  at  a  time,  you  run  as  fast  as  you  can,  300  feet  up 
the  hill.  You'll  be  timed  individually  on  this.  At  the  top 
of  the  hill  you  take  another  15-minute  break.  After  the 
break you walk down the hill to the river where you take a
3-minute  break.  Again,  don't  drink  the  river  water.  Then 
you'll  walk  on  along  the  course  to  Arrow  No.  283,  and  take  a 
5-minute  break  here.  You'll  continue  walking  to  the  starting 
point  of  the  200-foot  Double  Time.  Here,  one  at  a  time,  you 
run  as  fast  as  you  can  for  200  feet  over  level  ground  in  the 
jungle.  We  time  you  individually  on  this.  After  all  of  you 
in  a  group  have  completed  the  Double  Time,  you  take  a  5- 
minute  break.  Then  you  walk  on  back  here  to  the  end  of  the 
course. 

Now  let's  get  you  lined  up  into  four  five-man  groups  by  com¬ 
panies,  and  get  you  identified  with  some  tape  markers.  We'll 
weigh  you  first,  and  then  when  you  have  put  your  uniforms 
back  on,  we'll  put  some  tape  markers  on  you.  The  group  from 
Combat  Support  Company,  carrying  20-25-pound  packs,  will  be 
first.  When  you  get  back  from  the  course,  I  will  be  asking 
you  questions  about  how  difficult  you  thought  it  was. 

As  the  men  in  the  first  group  were  being  weighed  and  fitted  with 
tape  markers  on  their  arms,  the  following  instructions  were  given  to 
the  four  timers: 

Your  main  job  is  to  time  the  soldiers  as  they  go  through  the 
Man-Pack  Portability  Course--time  the  performance  events  and 
the breaks, etc. Otherwise, you should leave matters such as
setting  the  pace  on  the  Forced  March  and  walking  parts  of  the 
course  to  the  NCOIC  of  the  group.  Of  course,  you  should  be 
alert  and  see  that  none  of  the  men  get  off  the  course. 

Each  of  you  should  have  a  wristwatch,  a  stopwatch  and  a  whis¬ 
tle.  Here  are  the  cards  on  which  the  times  are  recorded.  Let 
me  quickly  go  over  them  with  you. 

Group--should be A, B, C, or CS--the Company the men are from.

Date--be sure this is correct, otherwise the data will be
confused.

Test Item--leave blank.

Start Time--wristwatch time when you start course, to the
minute.
You'll be using the Yellow Course, so I have scratched out
the  marker  numbers  for  the  Red  Course. 

The  Forced  March  goes  from  the  start  of  the  Course  to  Marker 
145.  You  can  use  either  your  wristwatch  or  the  stopwatch  to 
time this, to the minute.

At  Marker  145,  the  end  of  the  Forced  March,  take  a  15-minute 
break.  Then  walk  at  a  normal  easy  pace  down  to  the  Frijoles 
River.  Take  a  3-minute  break  here,  and  warn  the  men  not  to 
drink river water or fill their canteens.

Then walk to the beginning of the Uphill Run, where you'll
see the engineer tape on the trees. Here, have the men line
up--1, 2, 3, 4 and 5, in order--and tell the NCOIC to have
the men start the Uphill Run one at a time when you signal
from the top of the hill. Then go up the hill to the end of
the Uphill Run. Signal that you're ready with three blasts
on the whistle, then one blast to start man no. 1--and start
the stopwatch! When man no. 1 finishes the Uphill Run, record
his time and then give one blast on the whistle as a
signal for man no. 2 to start--and so forth. Now, let's
check stopwatches to be sure they're wound and that you know
how to operate the one you have.

After  the  last  man  in  the  group  finishes  the  Uphill  Run,  take 
a  15-minute  break,  then  walk  at  a  normal  pace  down  to  the 
river.  Take  a  3-minute  break  at  the  river— again  no  drinking 
from  the  river  or  filling  canteens— and  then  a  normal  walking 
pace  to  marker  283.  Take  a  5-minute  break  here,  and  then 
walk  to  the  beginning  of  the  Double  Time,  where  you'll  again 
see  the  engineer  tape  tied  to  the  trees.  Have  the  men  line 
up— you  go  to  the  end  of  the  Double  Time  course,  and  start 
and  time  them  individually,  as  you  did  on  the  Uphill  Run. 

When  the  last  man  finishes  the  Double  Time,  take  a  5-minute 
break,  then  walk  at  a  normal  pace  to  the  end  of  the  course. 

Record end time from your wristwatch, to the nearest minute,
when the last man in the group reaches the end of the course.

As  the  group  returned  from  the  course  on  the  1st  day,  after  they 
had  been  weighed  and  had  their  canteens  weighed,  the  experimenter  took 
them, one at a time, and gave them the following explanation of
magnitude estimation:

I'd like you to tell me how difficult the various parts of
the course were by simply giving me a number. If you thought
a part of the course was very hard you should give me a big
number. If you thought it was easy, you should give me a
small number. Please do not give me any zeros or negative
numbers, though. Now, think about the Forced March. If you
thought it was very hard, give me a big number. If you
thought it was very easy, give me a small number.

What  about  crossing  the  river? 

Uphill  Run? 

Double  Time? 

Walking  parts  of  the  course? 

On succeeding days, 6, 7 and 8 September, after the soldiers had
finished  the  course  and  been  weighed,  they  were  greeted  with: 

OK, you've been over the course and done this before. What
about the Forced March today? Et cetera.

2.  Analyses  and  Results:  Magnitude  Estimation 

The loads carried by the soldiers as they traversed the MPPC were
intended to influence their perception of the difficulty of the
various parts of the MPPC, in the same manner as the actual lengths of
the stimulus lines presented to subjects in the laboratory experiments
were shown to determine their perceived line lengths. Magnitude
estimation was used to measure the subjective variables in both cases:
perceived line lengths in the laboratory experiments, and perceived
difficulties of various parts of the MPPC in the field experiment now
under consideration. Thus, it is reasonable to see how well the
expected relationships between loads carried over the MPPC and the
perceived difficulties of various parts of the MPPC can be represented
by power functions. Of course, it cannot be expected that the
relationships between loads carried and perceived difficulties of
various parts of the MPPC will be represented as well by power
functions (or any other exact functions, for that matter) as was found
to be the case with lengths of stimulus lines and perceived line
lengths in the laboratory experiments. As was pointed out at the
beginning of this section, this is true because perceived difficulties
of the various parts of the MPPC will clearly depend on a number of
other physical, physiological and psychological variables, in addition
to the independent variable manipulated in this field experiment--the
loads carried by the soldiers as they traversed the MPPC.

To see how well power functions represent the relationships between
loads carried over the MPPC and the perceived difficulties of
various parts of the MPPC, as measured by magnitude estimation,
both the loads carried over the MPPC and the geometric means of the
magnitude estimates were converted to decibels and plotted on linear
coordinates, as was done in the laboratory experiments. The data used
in these analyses are those provided by the 12 soldiers who traversed
the MPPC and provided magnitude estimates of the difficulties of parts
of the MPPC on all 4 days. Table 12 shows ratios and conversions to
decibels of the stimulus loads carried, and of the five kinds of
subjective responses--the perceived difficulties of the five parts of
the MPPC.
Table 12. Ratios and Conversions to Decibels: Magnitude
Estimates of Difficulty of MPPC

Stimuli: Loads Carried

    lbs (kg)          Ratios    Decibels
    22.5 (10.2)       1.00      0.00
    32.5 (14.7)       1.44      1.60
    42.5 (19.3)       1.89      2.76
    52.5 (23.8)       2.33      3.68

Responses: Geometric Means of Magnitude Estimates
(rows in order of increasing load)

Response 1. Forced March

    Geom Means
    of Mag Est        Ratios    Decibels
     3.99             1.00      0.00
     8.47             2.12      3.27
     7.44             1.86      2.71
    13.88             3.48      5.41

Response 2. River Crossing

    Geom Means
    of Mag Est        Ratios    Decibels
     1.99             1.00      0.00
     3.38             1.70      2.30
     3.61             1.81      2.59
     3.80             1.91      2.81

Response 3. Uphill Run

    Geom Means
    of Mag Est        Ratios    Decibels
    [illegible]       1.00      0.00
    13.46             1.54      1.88
    19.08             2.18      3.39
    21.32             2.44      3.87

Response 4. Double Time

    Geom Means
    of Mag Est        Ratios    Decibels
     3.95             1.00      0.00
     6.75             1.71      2.33
     5.22             1.32      1.21
    12.54             3.18      5.02

Response 5. Walking Parts

    Geom Means
    of Mag Est        Ratios    Decibels
    [illegible]       1.00      0.00
     4.96             1.33      1.23
     6.48             1.73      2.39
     9.09             2.43      3.86
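The conversions in table 12 can be retraced with a few lines of code. The sketch below is illustrative only: it assumes that each ratio is taken relative to the value for the lightest load, and that the decibel value is 10 times the common logarithm of the ratio, which agrees with the tabled load values. The function names and the three example estimates are hypothetical and are not study data.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, via the mean of their logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def to_decibels(ratio):
    """Decibel value of a ratio, assumed here to be 10 * log10(ratio)."""
    return 10.0 * math.log10(ratio)

# Midpoints of the four load ranges, in pounds, as listed in table 12.
loads_lbs = [22.5, 32.5, 42.5, 52.5]

# Ratios are taken relative to the lightest load.
load_ratios = [load / loads_lbs[0] for load in loads_lbs]
load_db = [to_decibels(r) for r in load_ratios]

for load, ratio, db in zip(loads_lbs, load_ratios, load_db):
    print(f"{load:5.1f} lbs  ratio {ratio:.2f}  {db:.2f} dB")

# Hypothetical estimates from three subjects (illustration only).
example_estimates = [1.0, 10.0, 100.0]
print(f"geometric mean = {geometric_mean(example_estimates):.2f}")
```

The same two steps, ratio to the first value and then conversion to decibels, apply to the geometric means of the magnitude estimates in each response block of the table.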
When the decibel values of the geometric means of the magnitude
estimates of the difficulties of each of the five parts of the MPPC
are plotted in turn against the decibel values for the loads carried,
the results obtained are those presented in figure 9. Slopes of
best-fitting lines are generally in the vicinity of 1.00, and vary
from .75 to 1.28. The fit of the lines to the points, and therefore of
the corresponding power function to the data, ranges from quite good
for the Uphill Run and Walking Parts of the MPPC, to rather poor for
the Double Time. It should be noted that the ranges of both the
stimulus variables--loads carried, and the response
variables--magnitude estimates of the parts of the MPPC, are only 3 to
5 decibels, as contrasted with corresponding ranges of around 20
decibels in the laboratory experiments on magnitude estimation of line
length. Thus, in this experiment, it was possible to investigate only
relatively small portions of the relationships between loads carried
and perceived difficulty of parts of the MPPC.
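A straight line on decibel (logarithmic) coordinates corresponds to a power function, and the slope of the line is the exponent of that function. The sketch below fits such lines to the decibel values of table 12. It assumes ordinary least squares, which the report does not state was the fitting method used; the function name is hypothetical. Under that assumption the slopes come out between roughly 0.75 and 1.29, close to the range quoted above.

```python
# Decibel values as read from table 12.
load_db = [0.00, 1.60, 2.76, 3.68]
response_db = {
    "Forced March":   [0.00, 3.27, 2.71, 5.41],
    "River Crossing": [0.00, 2.30, 2.59, 2.81],
    "Uphill Run":     [0.00, 1.88, 3.39, 3.87],
    "Double Time":    [0.00, 2.33, 1.21, 5.02],
    "Walking Parts":  [0.00, 1.23, 2.39, 3.86],
}

def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# The slope on decibel coordinates estimates the power-function exponent.
slopes = {part: least_squares_slope(load_db, y) for part, y in response_db.items()}

for part, slope in slopes.items():
    print(f"{part:15s} slope = {slope:.2f}")
```

With only four points per line and a stimulus range of under 4 decibels, these slopes are rough estimates at best, which is the limitation the paragraph above points out.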

It may be of interest to know the results of an analysis of
differences between arithmetic means of the magnitude estimates. Table
13 presents the arithmetic means of the magnitude estimates of
difficulty, for the four different loads carried and for the five
different parts of the MPPC. An analysis of variance was performed on
the data on which the means of table 13 are based. This analysis was a
two-factor analysis with repeated measures on both factors. The
results of this analysis are shown in table 14. Though the means in
table 13 appear to differ widely, this analysis of variance shows that
the differences are not statistically significant, either for the
parts of the MPPC, the loads carried, or the interaction between these
two factors.

The between-subjects variability of these magnitude estimates was
large, with one subject using only the range of 1 to 4, and 11 of 12
subjects using ranges between 1 and 60, while one subject used a range
of 10 to 1,000. The extreme values of the magnitude estimates of this
one subject obviously had a large effect on most of the arithmetic
means in table 13. However, if the geometric means in table 12 are
compared with the corresponding arithmetic means in table 13, it may
be seen that the geometric means were much less affected by the
extreme values of the magnitude estimates made by this one subject.

3. Analyses and Results: Performance Data

The times required to traverse the various parts of the MPPC and
the percentage of initial body weight lost in perspiration are
objective performance data yielded by this field study. It will be of
interest to analyze and compare the results with those from the
magnitude estimation data. For the Uphill Run, Double Time and Percent
Initial Body Weight Lost variables, individual data are available; for
the Forced March, Total Time and Normal Walk variables, the data are
for groups. In the individual performance data, the data are for 20
subjects on each of the 4 days; i.e., data from substitutes are
included. This was done because, in the performance events, the data
obtained from substitutes were much more similar to those obtained
from the regular subjects than was the case with the magnitude
estimation data, where a substitute often would use a very different
range of numbers from that used by the regular subject whom he
replaced.

Table 13. Arithmetic Means of Magnitude Estimates of Difficulty*

[Table body not recoverable from the source.]

*Data on 12 soldiers who carried all four loads over the MPPC.


Table 14. Analysis of Variance of Magnitude Estimates of Difficulty*

    Source of Variance    SS              df     MS           F
    Between Subjects      3,529,975.65     11
    Within Subjects       2,227,199.85    228
      Parts of Course       159,960.48      4    39,990.12    1.44**
      Error-P             1,220,646.62     44    27,741.97
      Loads                  61,042.58      3    20,347.53    [illegible]**
      Error-L               503,298.07     33    15,251.46
      P X L                  29,936.19     12     2,494.68    [illegible]**
      Error-P X L          [illegible]    132     1,911.48
    Total                 5,757,175.50    239

*Data on 12 soldiers who carried all four loads over the MPPC.
**Not significant.
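Each F ratio in table 14 is the mean square for an effect divided by the mean square of its own error term. The sketch below recomputes the three ratios. It is illustrative only: several digits of table 14 are hard to read in the source, so the mean squares entered here are best readings, chosen to agree with the legible sums of squares and degrees of freedom.

```python
# Mean squares as best read from table 14: (effect MS, error MS).
# Some digits are unclear in the source; these values are assumptions
# that agree with the tabled sums of squares divided by their df.
mean_squares = {
    "Parts of Course": (39990.12, 27741.97),
    "Loads":           (20347.53, 15251.46),
    "P X L":           (2494.68, 1911.48),
}

# In a repeated-measures design each effect is tested against its own error term.
f_ratios = {
    source: ms_effect / ms_error
    for source, (ms_effect, ms_error) in mean_squares.items()
}

for source, f_value in f_ratios.items():
    print(f"{source:16s} F = {f_value:.2f}")
```

All three ratios are below 1.5, in keeping with the report's statement that none of the differences among the arithmetic means was statistically significant.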


The arithmetic means of the performance variables for the four
loads carried over the MPPC are presented in table 15. An analysis of
variance was carried out on the data for each performance variable.
These analyses were of two kinds. For the individual data, two-factor
analyses were carried out with repeated measures on one of the factors.
The between-subjects factor is the groups factor; i.e., the four groups
in which the subjects traversed the MPPC. The within-subjects factor
is the loads-carried factor. For the group data, the analyses were
single-factor analyses with repeated measures on the loads-carried
factor.
Table 15. Arithmetic Means of Performance Variables

                                     Loads Carried                        N
                     20-25 lbs    30-35 lbs    40-45 lbs    50-55 lbs    (groups
                     (9.1-        (13.6-       (18.1-       (22.7-       or
    Variable         11.3 kg)     15.9 kg)     20.4 kg)     24.9 kg)     persons)

    Forced March     [illegible]  [illegible]  [illegible]  32.50         4
      (min)
    Uphill Run       [illegible]  [illegible]  [illegible]  87.40        20
    Double Time      19.00        27.50        19.20        19.60        20
    Total Time       [illegible]  [illegible]  [illegible]  [illegible]   4
      (min)
    Normal Walk      [illegible]  [illegible]  70.75        [illegible]   4
      (min)
    Percent Initial  [illegible]  2.49         2.57         2.48         20
      Body Weight
      Lost (%)


The analyses of variance of the individual data (Uphill Run, Double
Time and Percent Initial Body Weight Lost) are presented in table
16. It can be seen that differences between average times required to
cover the Uphill Run and the Double Time events, for different loads
carried, were significant, but differences between averages of Percent
Initial Body Weight Lost were not. These analyses also show
significant interactions between groups and loads carried on both the
Uphill Run and Double Time events. This means that the relationship
between loads carried and performance on these events is significantly
different for at least one group, as compared with the other groups.

Table 16. Analyses of Variance of Individual Scores on MPPC

Uphill Run

    Source of Variation           SS           df    MS         F       p
    Between Subjects              14,624.14    19
      A (Groups)                   7,744.04     3    2581.35     6.00   <.01
      Subjects within Groups       6,880.10    16     430.01
    Within Subjects               44,938.25    60
      B (Loads)                    8,244.74     3    2748.25     8.26   <.001
      AB                          20,715.21     9    2301.69     6.91   <.001
      B X Subjects within Groups  15,978.30    48     332.88
    TOTAL                         59,562.39    79

Double Time

    Source of Variation           SS           df    MS         F       p
    Between Subjects               1,315.55    19
      A (Groups)                   1,068.15     3     356.05    23.03   <.001
      Subjects within Groups         247.40    16      15.46
    Within Subjects                4,634.00    60
      B (Loads)                    1,020.55     3     340.18    46.41   <.001
      AB                           3,261.65     9     362.41    49.44   <.001
      B X Subjects within Groups     351.80    48       7.33
    TOTAL                          5,949.55    79

Percent Initial Body Weight Lost

    Source of Variation           SS           df    MS         F       p
    Between Subjects                  25.35    19
      A (Groups)                       6.40     3       2.13     1.81   n.s.*
      Subjects within Groups          18.95    16       1.18
    Within Subjects                   29.43    60
      B (Loads)                        0.19     3       0.06    <1.00   n.s.
      AB                               9.21     9       1.02     2.43   n.s.
      B X Subjects within Groups      20.03    48       0.42
    TOTAL                             54.78    79

*Not significant, p > .05

To investigate this, arithmetic means for each of the four groups were
computed, for each load carried. For the Uphill Run, these means are
presented in table 17. Figure 10 shows a plot of these means and from
this plot it is immediately obvious that the performance of the CS
Company group when carrying 50- to 55-pound loads (on the last of the
4 days) is very different than would be expected from the performances
of the other three groups (and of the CS group when carrying other
loads). When the data were analyzed and this became apparent, an
inquiry was started to try to find the reasons for this deviant
performance. No clear-cut reason was found, but the following reasons
were suggested [remainder of sentence illegible]:

(a) The men of the CS Company, who worked in a variety of
maintenance and support jobs, were not as well conditioned physically
as the men in the other units, who had trained extensively in the
jungle.

(b) The men in the line companies were challenged by the MPPC,
and heavier loads appeared to arouse greater determination in
them; while the men of the CS Company regarded traversing the MPPC as
almost a punishment, and the 50- to 55-pound loads were considered as
the "last straw." In other parts of the MPPC, when carrying the 50-
to 55-pound loads, the CS group had high (but not the highest) times,
but the Uphill Run was much more demanding than the other parts of the
MPPC, and it is possible that the morale of this group suffered
considerably at this point.

The arithmetic means on the Double Time event of the four
groups, for each load carried, are presented in table 18. Figure 11
shows a plot of these means and again the deviant performance is
perfectly obvious. This time it was the Company C group when carrying
30 to 35 pounds (on the first of the 4 days) that performed very
differently than the other groups, and also very differently than they
(Company C) did when carrying other loads. Inquiry revealed a clear-cut
reason for this different performance; the timer who accompanied them
on this first day remembered that this group had worn combat boots
instead of jungle boots on this one day, and had slipped a good deal
while traversing the course. Detailed examination of the data for
other parts of the MPPC showed that on this day the Company C group
had higher times than any of the other groups did while carrying 30 to
35 pounds, and that the Company C group had a higher average score for
Percent Initial Body Weight Lost on this day than they did on any of
the other 3 days. The Company C group average for Percent Initial
Body Weight Lost was also considerably higher on this day than the
averages for the other three groups were on any of the 4 days. Thus
it is clear that the relatively poor traction of the combat boots had
a considerable effect on the performance of this group, who had
inadvertently worn them on this one day. Why it should have affected
their performance on the Double Time event so much more than it did on
other parts of the MPPC is not clear, unless it is because the Double
Time event is near the end of the MPPC, and the group may have been
very tired at that point.




Table 18. Double Time: Arithmetic Means for Four Groups

                              Loads Carried
    Groups    20-25 lbs    30-35 lbs    40-45 lbs    50-55 lbs
    A         17.00 sec    17.20 sec    17.20 sec    18.80 sec
    B         19.80 sec    19.60 sec    18.60 sec    18.20 sec
    C         15.20 sec    52.00 sec    21.00 sec    20.40 sec
    CS        24.00 sec    21.20 sec    20.00 sec    21.00 sec



Figure  11.  Double  Time:  Loads  Carried,  Arithmetic 
Means  for  Four  Groups  Plotted. 
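The cell means of table 18 are enough to retrace the Groups, Loads and Groups-by-Loads sums of squares reported for the Double Time in table 16. The sketch below is illustrative only. It assumes five men in every cell with no missing times, and uses the standard decomposition for a balanced two-factor design; the variable names are hypothetical. Under these assumptions the results agree with the tabled values of 1,068.15, 1,020.55 and 3,261.65.

```python
# Cell means in seconds from table 18: rows are groups, columns are the
# four loads (20-25, 30-35, 40-45 and 50-55 lbs).
cell_means = {
    "A":  [17.00, 17.20, 17.20, 18.80],
    "B":  [19.80, 19.60, 18.60, 18.20],
    "C":  [15.20, 52.00, 21.00, 20.40],
    "CS": [24.00, 21.20, 20.00, 21.00],
}
n_per_cell = 5  # assumption: five-man groups, no missing times

n_groups = len(cell_means)
n_loads = 4
all_cells = [m for row in cell_means.values() for m in row]
grand_mean = sum(all_cells) / len(all_cells)

# Marginal means for each load (column) and each group (row).
load_means = [
    sum(row[j] for row in cell_means.values()) / n_groups for j in range(n_loads)
]
group_means = {g: sum(row) / n_loads for g, row in cell_means.items()}

# Sums of squares for a balanced two-factor layout.
ss_groups = n_per_cell * n_loads * sum(
    (m - grand_mean) ** 2 for m in group_means.values()
)
ss_loads = n_per_cell * n_groups * sum((m - grand_mean) ** 2 for m in load_means)
ss_cells = n_per_cell * sum((m - grand_mean) ** 2 for m in all_cells)
ss_interaction = ss_cells - ss_groups - ss_loads

print("load means:", [round(m, 2) for m in load_means])
print(f"SS groups = {ss_groups:.2f}")
print(f"SS loads = {ss_loads:.2f}")
print(f"SS groups x loads = {ss_interaction:.2f}")
```

The single Company C mean of 52.00 seconds accounts for most of the variation among the cell means, which is why both the Loads and the Groups-by-Loads terms are so large for this event.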
The analyses of variance for the group data (Forced March, Total
Time and Normal Walk variables) are presented in table 19. These
analyses show that the average times for the groups on the Forced
March, Total Time and Normal Walk variables differed significantly
when different loads were carried.


Table 19. Analyses of Variance for Group Scores on MPPC

Forced March

    Source of Variance    SS         df    MS        F       p
    Between Groups          32.75     3
    Within Groups          357.00    12
      Loads                207.25     3     69.08    4.15    <.05
      Error                149.75     9     16.64
    TOTAL                  389.75    15

Total Time

    Source of Variance    SS         df    MS        F       p
    Between Groups          95.19     3
    Within Groups         4821.75    12
      Loads               2898.19     3    966.06    4.52    <.05
      Error               1923.56     9    213.73
    TOTAL                 4916.94    15

Normal Walk

    Source of Variance    SS         df    MS        F       p
    Between Groups         201.50     3
    Within Groups         2286.50    12
      Loads               1381.00     3    460.33    4.58    <.05
      Error                905.50     9    100.61
    TOTAL                 2488.00    15


4.  Discussion 

In summary, magnitude estimates of the difficulty of the Uphill
Run and Walking Parts of the MPPC are fitted quite well by power
functions of loads carried, with exponents of approximately 1.00. The
fit for magnitude estimates of the difficulty of River Crossing is
less satisfactory, and those for Forced March and Double Time are
definitely poorer. Keeping in mind the expectations (discussed at the
beginning of this section) that perceived difficulty of traversing
parts of the MPPC would be affected by several variables in addition
to the loads-carried variable, these results are a reasonably
satisfactory outcome of this part of the field experiment. It is
entirely possible that magnitude estimation did measure quite well the
soldiers' subjective perceptions of the difficulty of various parts of
the MPPC, and that these perceptions were in some cases affected by
other factors so that the relationships between loads carried and the
magnitude estimates deviated from the expected power functions.

The  failure  of  magnitude  estimates  of  difficulty  of  the  parts  of 
the  MPPC  to  show  significant  relationships  with  loads  carried,  when 
tested  by  analysis  of  variance,  is  probably  because  the  four  loads 
used  in  this  study  cover  a  limited  segment  of  the  possible  range  of 
loads. A study using both lighter and heavier loads than those used
in  this  study  would  almost  certainly  show  a  significant  relationship 
between  loads  carried  and  magnitude  estimates  of  difficulty. 

Analyses of the performance data did show that the loads-carried
factor was significant for all variables except Percent Initial Body
Weight Lost. However, on the Uphill Run and Double Time variables,
where individual data permitted analysis of the Groups by Loads
interaction, it appeared that uncontrolled factors--such as the
wearing of combat boots instead of jungle boots by the Company C group
on one day and perhaps differences in physical conditioning and
attitudes between the soldiers from the line companies (A, B and C)
and those from the CS Company--accounted for substantial variations in
performance. Examination of table 15 shows that for none of the
performance variables was there a reasonably regular increase in means
from lighter to heavier loads. Thus, the statistical significance of
the loads-carried factor in these analyses is probably due to
uncontrolled factors, such as those described above, rather than
reliable effects of the various loads carried.
SECTION VI. DISCUSSION AND CONCLUSIONS


Magnitude estimation was selected as the most promising response
mode for evaluation in human factors settings, after a survey of
cross-modality matching studies, because of its convenience and
because of universal familiarity with the number scale. In a series of
laboratory studies of magnitude estimation of line length, it was
found that the data obtained by magnitude estimation were very well
fitted by a power function, and that the exponent of the power
function which best fitted the data was approximately 0.91 to 0.94.
These results compared very well with those reported in the literature
(Stevens, 1975); thus the experimenter was assured that the technique
was being applied correctly in the laboratory studies. The data on
magnitude estimation of line length were examined for evidence of the
stability (reliability) of measurement and it was found that stability
of measurement (at the level of group means with N = 12) was quite
satisfactory. Particularly impressive were the intraclass correlation
coefficients of 0.94 for the confirmatory experiment on magnitude
estimation of line length.

Magnitude estimates for a group of subjects were found to vary
widely with most of the estimates in a lower range of perhaps 1 to 10
up to 1 to 60, and a few estimates using much larger numbers. This is
what would be expected with measurement on a ratio scale, instead of
on an ordinal scale as yielded by other methods of measuring
subjective variables. For example, consider the following geometric
series: 1, 2, 4, 8, 16, 32, 64, 128, 256, in which the ratio of each
term to the preceding one is constant and equals 2 in this case. When
sampling from a population of values with a distribution of this kind,
it would be expected that most values sampled would be in the lower
range, with a few much higher values, as was found to be the case with
magnitude estimates. That the geometric mean, rather than the
arithmetic mean, is the appropriate measure of central tendency for
data of this kind is shown by the fact that the geometric mean of such
a series of values will be the middle value in the series--16 in the
example above. The arithmetic mean will be unduly influenced by the few
extremely high values, and is 56.78 in the example above.
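The two means quoted for the example series can be checked directly. The short sketch below is illustrative only; the variable names are hypothetical. It computes the geometric mean through the mean of the logarithms.

```python
import math

# The geometric series from the example: 1, 2, 4, ..., 256.
series = [2 ** k for k in range(9)]

arithmetic_mean = sum(series) / len(series)

# Geometric mean: antilog of the mean of the logarithms.
geometric_mean = math.exp(sum(math.log(v) for v in series) / len(series))

print(f"arithmetic mean = {arithmetic_mean:.2f}")
print(f"geometric mean  = {geometric_mean:.2f}")
```

Because the logarithms of the series are evenly spaced, their mean falls on the middle term, so the geometric mean is 16, while the arithmetic mean is pulled up toward the largest values.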

It is not likely that direct evidence can be obtained that will
make it intuitively obvious that subjective measurement by magnitude
estimation is measurement on a ratio scale. Because the object of
measurement is subjective, it is concealed within the private mental
processes of each individual. In the future, sufficient information
may be available on neural functioning so that the problem can be
approached from this direction. But at present, it is not intuitively
obvious that the units of the subjective scale are equal and that
there is a natural zero point on this scale. Nevertheless, the
evidence that magnitude estimates are distributed as a geometric
series, cited above, points in the direction of the subjective scale
being a ratio scale.
On the practical side, magnitude estimates were easily gathered in
the field studies described in the preceding section. Of course, it is
necessary that the persons who are to make the magnitude estimates
understand clearly the variable on which they are to provide data:
comfort, difficulty, confidence, or whatever it may be. And they
should also understand that they may use any numbers they wish (except
for zero and negative numbers); i.e., they should not restrict
themselves to any preconceived range, such as 1 to 10.

Also, it should be clearly understood by the experimenter or tester
that suitable experimental designs and methods of statistical analysis
should be used. Generally, the principles of experimental design, such
as suitable sampling, control and randomization procedures, apply when
using magnitude estimation, just as they do in any other test or
investigation. It is not definite at this point that the usual
arithmetic mean and variance estimating statistical methods are
entirely suitable for use with magnitude estimates. They were used in
this study because no procedures were available for testing for
differences between geometric means.

In the field studies described in the preceding section, magnitude
estimation yielded results that generally made sense and agreed with
available corroborating information. Differences in magnitude
estimates between treatment groups were sometimes clear-cut, as in the
comparisons of the helmets and vests, and sometimes less so, as in the
comparisons of machine guns.

This study has clearly shown that magnitude estimation can be used
to measure subjective variables in human factors evaluations. It
appears to be superior to the rating and ranking methods used for many
years, in that it achieves measurement on a ratio scale, as compared
with the ordinal scale measurement achieved by the rating and ranking
methods.

The author suggests that magnitude estimation be used in human
factors evaluations, along with the usual rating and ranking methods
of measuring subjective variables, so that the practical usefulness
and validity, in terms of present day practices and standards, may be
better assessed. When it is possible to compare the results of
magnitude estimation with objective measures of performance that would
be expected to be related to subjective variables, this should be
done, since objective measures of performance provide a better basis
for evaluation of magnitude estimation than do the usual rating and
ranking methods of measuring subjective variables. Further empirical
and theoretical investigations should be carried out on the sampling
distributions of magnitude estimates and on the suitability (or
unsuitability) of the usual arithmetic mean and variance estimation
statistical methods for use with magnitude estimates. If necessary,
statistical testing procedures should be developed for differences
between geometric means.